Manual Update


Document Title: Third-Generation TMS320 User’s Guide, SPRU031


Document Number: SPRZ048 ECN Number: 526635 


This Manual Update should be appended to the Third-Generation TMS320 User’s Guide. Changes 
should be made as indicated on the designated pages. 


Page    Change or Add
2-2     Table 2-1:
        Line    Function (Now)    Function (Should Be)
        1       X11               XA11
        2       X12               XA12
        5       XOD2              XD2
        20      IOA5              XA5
        26      IOD23             XD23
        27      IOD24             XD24
        28      IOD25             XD25
        28      VSUBS             SUBS
        29      IOD26             XD26
        30      IOD27             XD27
        31      IOD28             XD28
        32      IOD29             XD29
        33      IOD30             XD30
        34      IOD31             XD31
        35      IORDY             XRDY
2-6 Table 2-2. Insert the following at the end of the table. 


LOCATOR (1 PIN) 


7-9 Line 7: src should be dst. 

A-5 Table A-5: Characteristics (13), (14), (15), (16), (17), and (18) change (IO) to (X) 
in name and description. 

A-6 Figure A-4: Change (IO)R/W to (X)R/W, (IO)A to (X)A, (IO)D to (X)D, and
(IO)RDY to (X)RDY.

A-6 Table A-6: All characteristics change (IO) to (X) in name and description. 

A-7 Figure A-5: Change (IO)R/W to (X)R/W, (IO)A to (X)A, (IO)D to (X)D, and
(IO)RDY to (X)RDY.

A-8 Figure A-6: Change IOR/W to XR/W, IOA to XA, IOD to XD, and IORDY to 
XRDY. Change (M)STRB in title to IOSTRB. 

A-9 Table A-7: Characteristics (22), (14.1), (15.1), (16.1), (17.1), and (18.1) change IO
to X in name and description.

A-9 Table A-8: All characteristics change IO to X in name and description.


The changes shown in this Manual Update will be included in the next revision of the Third-Generation 
TMS320 User’s Guide. 
Section 1


Introduction 


The TMS320C30 (third-generation) Digital Signal Processor (DSP) is a 
high-performance CMOS 32-bit device in the TMS320 family of single-chip 
digital signal processors. Since 1982 when the TMS32010 was introduced, 
the TMS320 family has established itself as the industry standard for digital 
signal processing. Powerful instruction sets, high-speed number-crunching 
capabilities, and innovative architectures have made this high-performance 
family of processors ideal for DSP applications. 


The TMS320 family consists of three generations of processors: TMS320C1x, 
TMS320C2x, and TMS320C3x (see Figure 1-1). The family has expanded to 
include enhancements of earlier generations and more powerful new gener- 
ations of digital signal processors. 


[Figure: performance plotted against time for the three TMS320 generations.]

TMS320C1x (32010, 32011, 320C10, 320C15, 320E15, 320C17, 320E17):
    16/32-bit CPU; 160-ns instr cycle; 256 W data RAM; 4K W ROM/EPROM;
    4K W ext prog mem; 16 x 16 = 32-bit mult; 2 serial ports;
    companding H/W; coprocessor I/F

TMS320C2x (32020, 320C25):
    16/32-bit CPU; 100-ns instr cycle; 544 W data RAM; 4K W prog ROM;
    128K W total mem; 16 x 16 = 32-bit mult; serial port and timer;
    block move/repeat; multiprocessor I/F

TMS320C3x (320C30):
    32-bit float-pt CPU; 60-ns instr cycle; 2K W RAM; 4K W ROM;
    64 W instr cache; 16M W total mem; 32 x 32 = 40-bit mult;
    2 serial ports; 2 timers; DMA

Figure 1-1. TMS320 Device Evolution
This document discusses the third-generation device, TMS320C30, within the
TMS320 family. The 60-ns cycle time of the TMS320C30 allows it to execute
operations at a performance rate previously available only on a supercomputer.
Even higher performance is gained through its large on-chip memories,
concurrent DMA controller, and instruction cache.


This section presents the following major topics: 


General Description (Section 1.1 on page 1-3)
Key Features (Section 1.2 on page 1-4)
Typical Applications (Section 1.3 on page 1-5)
How To Use This Manual (Section 1.4 on page 1-6)
References (Section 1.5 on page 1-8)




1.1 General Description 


The TMS320’s internal busing and special digital signal processing (DSP)
instruction set provide speed and flexibility. This combination produces a
processor family capable of executing up to 33 MFLOPS (million floating-point
operations per second). The TMS320 family optimizes speed by implementing
functions in hardware that other processors implement through software or
microcode. This hardware-intensive approach provides the design engineer
with power previously unavailable on a single chip.


The TMS320C30, the third-generation device in the TMS320 family, can
perform parallel multiply and ALU operations on integer or floating-point data
in a single cycle. The processor also possesses a general-purpose register file,
program cache, dedicated auxiliary register arithmetic units (ARAU), internal
dual-access memories, one DMA channel supporting concurrent I/O, and a
short machine-cycle time. High performance and ease of use are achieved
through greater parallelism, greater accuracy, and general-purpose features.


General-purpose applications are greatly enhanced by the large address space, 
multiprocessor interface, internally and externally generated wait states, two 
timers, two serial ports, and multiple interrupt structure. The TMS320C30 
supports a wide variety of system applications from host processor to dedi- 
cated coprocessor. 


The emphasis on total system cost has resulted in a less-expensive processor 
that can be designed into systems currently using costly bit-slice processors. 
High-level language is more easily supported through a register-based archi- 
tecture, large address space, powerful addressing modes, flexible instruction 
set, and support of floating-point arithmetic. 
1.2 Key Features


Some key features of the TMS320C30 are listed below. 
60-ns single-cycle instruction execution time
  - 33.3 MFLOPS (million floating-point operations per second)
  - 16.7 MIPS (million instructions per second)
One 4K x 32-bit single-cycle dual-access on-chip ROM block
Two 1K x 32-bit single-cycle dual-access on-chip RAM blocks
64 x 32-bit instruction cache
32-bit instruction and data words, 24-bit addresses
40/32-bit floating-point/integer multiplier and ALU
32-bit barrel shifter
Eight extended-precision registers (accumulators)
Two address generators with eight auxiliary registers and two auxiliary
  register arithmetic units
On-chip Direct Memory Access (DMA) controller for concurrent I/O and
  CPU operation
Integer, floating-point, and logical operations
Two- and three-operand instructions
Parallel ALU and multiplier instructions in a single cycle
Block repeat capability
Zero-overhead loops with single-cycle branches
Conditional calls and returns
Interlocked instructions for multiprocessing support
Two serial ports to support 8/16/32-bit transfers
Two 32-bit timers
Two general-purpose external flags, four external interrupts
180-pin grid array (PGA) package; 1-µm CMOS
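
The MIPS and MFLOPS figures in the feature list follow from the 60-ns cycle time. The short Python sketch below reproduces the arithmetic. It assumes that the MFLOPS figure counts two floating-point operations per cycle (one multiply and one ALU operation issued in parallel); the function name `rates` and its parameters are illustrative only.

```python
# Illustrative arithmetic only. The two-operations-per-cycle figure is an
# assumption based on the parallel multiply/ALU feature in the list above.
CYCLE_NS = 60  # single-cycle instruction execution time, in nanoseconds


def rates(cycle_ns, flops_per_cycle=2):
    """Return (MIPS, MFLOPS) for a given instruction cycle time in ns."""
    mips = 1000.0 / cycle_ns         # (1e9 ns per second / cycle_ns) / 1e6
    mflops = mips * flops_per_cycle  # one multiply plus one ALU op per cycle
    return mips, mflops


mips, mflops = rates(CYCLE_NS)
print(f"{mips:.1f} MIPS, {mflops:.1f} MFLOPS")
```

Rounded to one decimal place, the two values agree with the 16.7 MIPS and 33.3 MFLOPS quoted in the list.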
1.3 Typical Applications


The TMS320 family’s unique versatility and real-time performance offer flexible
design approaches in a variety of applications. In addition, TMS320 devices
can simultaneously provide the multiple functions often required in those
complex applications. Table 1-1 lists typical TMS320 family applications.


Table 1-1. Typical Applications of the TMS320 Family 


GENERAL-PURPOSE DSP                 GRAPHICS/IMAGING                    INSTRUMENTATION

Digital Filtering                   3-D Rotation                        Spectrum Analysis
Convolution                         Robot Vision                        Function Generation
Correlation                         Image Transmission/Compression      Pattern Matching
Hilbert Transforms                  Pattern Recognition                 Seismic Processing
Fast Fourier Transforms             Image Enhancement                   Transient Analysis
Adaptive Filtering                  Homomorphic Processing              Digital Filtering
Windowing                           Workstations                        Phase-Locked Loops
Waveform Generation                 Animation/Digital Map


VOICE/SPEECH                        CONTROL                             MILITARY

Voice Mail                          Disk Control                        Secure Communications
Speech Vocoding                     Servo Control                       Radar Processing
Speech Recognition                  Robot Control                       Sonar Processing
Speaker Verification                Laser Printer Control               Image Processing
Speech Enhancement                  Engine Control                      Navigation
Speech Synthesis                    Motor Control                       Missile Guidance
Text-to-Speech                      Kalman Filtering                    Radio Frequency Modems
Neural Networks                                                         Sensor Fusion


TELECOMMUNICATIONS                                                      AUTOMOTIVE

Echo Cancellation                   FAX                                 Engine Control
ADPCM Transcoders                   Cellular Telephones                 Vibration Analysis
Digital PBXs                        Speaker Phones                      Antiskid Brakes
Line Repeaters                      Digital Speech Interpolation (DSI)  Adaptive Ride Control
Channel Multiplexing                X.25 Packet Switching               Global Positioning
1200 to 19200-bps Modems            Video Conferencing                  Navigation
Adaptive Equalizers                 Spread Spectrum Communications      Voice Commands
DTMF Encoding/Decoding                                                  Digital Radio
Data Encryption                                                         Cellular Telephones


CONSUMER                            INDUSTRIAL                          MEDICAL

Radar Detectors                     Robotics                            Hearing Aids
Power Tools                         Numeric Control                     Patient Monitoring
Digital Audio/TV                    Security Access                     Ultrasound Equipment
Music Synthesizer                   Power Line Monitors                 Diagnostic Tools
Toys and Games                      Visual Inspection                   Prosthetics
Solid-State Answering Machines      Lathe Control                       Fetal Monitors
                                    CAM                                 MR Imaging
1.4 How To Use This Manual


This user’s guide serves as a reference book for the TMS320C30 digital
signal processor. It is designed to provide information that assists managers
and hardware/software engineers in application development. The first group
of sections provides specific information about the architecture and hardware
operation of the device. Later sections describe the software operation.
Specific software and hardware applications are provided in Sections 12 and
13, respectively. Electrical specifications and mechanical data can be found
in the data sheet (Appendix A).
The following table lists each section and briefly describes the section
contents.


Section 2. Pinout and Signal Descriptions. Drawing of the PGA
package for the TMS320C30. Functional listing of the
signals, their pin locations, and descriptions.


Section 3. Architectural Overview. Functional block diagram.
TMS320C30 design description, hardware components,
and device operation. Instruction set summary.


Section 4. CPU Registers, Memory, and Cache. Description of the
registers in the CPU register file. Memory maps provided
and instruction cache architecture, algorithm, and control
bits explained.


Section 5. Data Formats and Floating-Point Operations. Description
of signed and unsigned integer and floating-point formats.
Discussion of floating-point multiplication, addition,
subtraction, normalization, rounding, and conversions.


Section 6. Addressing. Operation, encoding, and implementation of
addressing modes. Format descriptions. System stack
management.


Section 7. Program Flow Control. Software control of program flow
with repeat modes and branching. Interlocked operations.
Reset and interrupts.


Section 8. External Bus Operation. Description of primary and
expansion interfaces. External interface timing diagrams.
Programmable wait-states and bank switching.


Section 9. Peripherals. Description of the DMA controller, timers, and
serial ports.


Section 10. Pipeline Operation. Discussion of the pipelining of
operations on the TMS320C30.


Section 11. Assembly Language Instructions. Functional listing of
instructions. Condition codes defined. Alphabetized
individual instruction descriptions with examples.


Section 12. Software Applications. Software application examples for
the use of various TMS320C30 instruction set features.


Section 13. Hardware Applications. Hardware design techniques and
application examples for interfacing to memories,
peripherals, or other microcomputers/microprocessors.


Four appendices are included to provide additional information. 


Appendix A. TMS320C30 Data Sheet. Electrical specifications, timing, 
and mechanical data. 


Appendix B. Development Support/Part Order Information. Listings of 
the hardware and software available to support the 
TMS320C30 device. 


Appendix C. Instruction Opcodes. List of the opcode fields for all the 
TMS320C30 instructions. 


Appendix D. Quality and Reliability. Discussion of Texas Instruments 
quality and reliability criteria for evaluating performance. 
1.5 References


The following reference list contains useful information regarding functions,
operations, and applications of digital signal processing. These books also
provide other references to many useful technical papers. The reference list is
organized into categories of general DSP, speech, image processing, and
digital control theory, and alphabetized by author.


General Digital Signal Processing: 


Antoniou, Andreas, Digital Filters: Analysis and Design. New York, NY: 
McGraw-Hill Company, Inc., 1979. 


Brigham, E. Oran, The Fast Fourier Transform. Englewood Cliffs, NJ: 
Prentice-Hall, Inc., 1974. 


Burrus, C.S. and Parks, T.W., DFT/FFT and Convolution Algorithms. 
New York, NY: John Wiley and Sons, Inc., 1984. 


Digital Signal Processing Applications with the TMS320 Family, Texas 
Instruments, 1986; Prentice-Hall, Inc., 1987. 


Gold, Bernard and Rader, C.M., Digital Processing of Signals. New 
York, NY: McGraw-Hill Company, Inc., 1969. 


Hamming, R.W., Digital Filters. Englewood Cliffs, NJ: Prentice-Hall,
Inc., 1977.


IEEE ASSP DSP Committee (Editor), Programs for Digital Signal Pro- 
cessing. New York, NY: IEEE Press, 1979. 


Jackson, Leland B., Digital Filters and Signal Processing. Hingham, MA: 
Kluwer Academic Publishers, 1986. 


Jones, D.L. and Parks, T.W., A Digital Signal Processing Laboratory
Using the TMS32010. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1987.


Lim, Jae and Oppenheim, Alan V. (Editors), Advanced Topics in Signal 
Processing. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1988. 


Morris, L. Robert, Digital Signal Processing Software. Ottawa, Canada: 
Carleton University, 1983. 


Oppenheim, Alan V. (Editor), Applications of Digital Signal Processing. 
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1978. 


Oppenheim, Alan V. and Schafer, R.W., Digital Signal Processing. En- 
glewood Cliffs, NJ: Prentice-Hall, Inc., 1975. 


Oppenheim, Alan V. and Willsky, A.N. with Young, I.T., Signals and 
Systems. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1983. 


Parks, T.W. and Burrus, C.S., Digital Filter Design. New York, NY: John 
Wiley and Sons, Inc., 1987. 


Rabiner, Lawrence R. and Gold, Bernard, Theory and Application of
Digital Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, Inc.,
1975.
Treichler, J.R., Johnson, Jr., C.R., and Larimore, M.G., Theory and
Design of Adaptive Filters. New York, NY: John Wiley and Sons, Inc.,
1987.


Speech: 


Gray, A.H. and Markel, J.D., Linear Prediction of Speech. New York, 
NY: Springer-Verlag, 1976. 


Jayant, N.S. and Noll, Peter, Digital Coding of Waveforms. Englewood 
Cliffs, NJ: Prentice-Hall, Inc., 1984. 


Papamichalis, Panos, Practical Approaches to Speech Coding. Engle- 
wood Cliffs, NJ: Prentice-Hall, Inc., 1987. 


Rabiner, Lawrence R. and Schafer, R.W., Digital Processing of Speech 
Signals. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1978. 


Image Processing: 


Andrews, H.C. and Hunt, B.R., Digital Image Restoration. Englewood 
Cliffs, NJ: Prentice-Hall, Inc., 1977. 


Gonzales, Rafael C. and Wintz, Paul, Digital Image Processing. Reading,
MA: Addison-Wesley Publishing Company, Inc., 1977.


Pratt, William K., Digital Image Processing. New York, NY: John Wiley 
and Sons, 1978. 


Digital Control Theory: 


Jacquot, R., Modern Digital Control Systems. New York, NY: Marcel 
Dekker, Inc., 1981. 


Katz, P., Digital Control Using Microprocessors. Englewood Cliffs, NJ:
Prentice-Hall, Inc., 1981.


Kuo, B.C., Digital Control Systems. New York, NY: Holt, Rinehart and
Winston, Inc., 1980.


Moroney, P., Issues in the Implementation of Digital Feedback
Compensators. Cambridge, MA: The MIT Press, 1983.


Phillips, C. and Nagle, H., Digital Control System Analysis and Design.
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1984.
Section 2


Pinout and Signal Descriptions 


The TMS320C30 (third-generation TMS320) digital signal processor is 
available in a 180-pin grid array (PGA) package. The pinout of this package 
(Figure 2-1), and a functional listing of the signals, pin locations, and de- 
scriptions are provided in this section. Electrical specifications and mechanical 
data are given in the data sheet (Appendix A). 
Figure 2-1. TMS320C30 Pin Assignments


Table 2-1. TMS320C30 Pin Function Assignments 


| Pin || Function | Pin || Function | Pin || Fun 
A 


A9 


AO 


LOCATOR 
EMU4 _ 
MC/MP 
MSTRB 
EMU6 


lODVDD 


lODVDD 
MDVDD 


P13 


Q15 PDVDD 

P15 VDD D8 

D2 VDD H4 VSS C8 
lOR/W D1 STRB F2 VDD H12 |} VSS H3 RSV6 
IOSTRB F4 TCLKO P4 VDD M8 VSS H13 |! RSV8 


1) ADVDD, DDVDD, IODVDD, MDVDD, and PDVDD pins (D4, D12, E8, H5, H11, L8, M4, and M12)
are on a common plane internal to the device.


2) VDD pins (D8, H4, H12, and M8) are on a common plane internal to the device. 


3) VSS, CVSS, and IVSS pins (B2, B14, C8, H3, H13, N8, and P14) are on a common plane internal
to the device.


4) DVSS pins (C3, C13, N3, and N13) are on a common plane internal to the device. 
2.1 Signal Descriptions


The signal descriptions for the TMS320C30 device in the microprocessor
mode are provided in this section. Table 2-2 lists each signal, the number of
pins, function, and operating mode(s), i.e., input, output, or high-impedance
state as indicated by I, O, or Z. All pins labelled ‘NC’ are not to be connected
by the user. A line over a signal name (e.g., RESET) indicates that the signal
is active low (true at a logic ‘0’ level). The signals in Table 2-2 are grouped
according to function.


Table 2-2. TMS320C30 Signal Descriptions 


SIGNAL      #PINS   I/O/Z†   DESCRIPTION


PRIMARY BUS INTERFACE (61 PINS)

D(31-0)             I/O/Z    32-bit data port of the primary bus interface.

A(23-0)                      24-bit address port of the primary bus interface.

R/W                          Read/write signal for primary bus interface. This pin is high
                             when a read is performed and low when a write is performed
                             over the parallel interface.

STRB                         External access strobe for the primary bus interface.

RDY                          Ready signal. This pin indicates that the external device is
                             prepared for a primary bus interface transaction to complete.
                             As long as RDY is a logic high, the data and address buses
                             of the primary bus interface remain valid.

HOLD                         Hold signal for primary bus interface. When HOLD is a logic
                             low, any ongoing transaction is completed. The A(23-0),
                             D(31-0), STRB, and R/W signals are placed in a
                             high-impedance state, and all transactions over the primary
                             bus interface are held until HOLD becomes a logic high.

HOLDA                        Hold acknowledge signal for primary bus interface. This
                             signal is generated in response to a logic low on HOLD. It
                             signals that A(23-0), D(31-0), STRB, and R/W are placed
                             in a high-impedance state and all transactions over the bus
                             will be held. HOLDA will be high in response to a logic
                             high of HOLD.


EXPANSION BUS INTERFACE (49 PINS)

XD(31-0)    32      I/O/Z    32-bit data port of the expansion bus interface.

XA(12-0)                     13-bit address port of the expansion bus interface.

XR/W                O/Z      Read/write signal for expansion bus interface. When a read
                             is performed, this pin is held high; when a write is
                             performed, this pin is low.

MSTRB                        External memory access strobe for the expansion bus
                             interface.

IOSTRB                       External I/O access strobe for the expansion bus interface.

XRDY                         Ready signal. This pin indicates that the external device is
                             prepared for an expansion bus interface transaction to
                             complete. As long as XRDY is high, the data and address
                             buses of the expansion bus interface remain valid.


† Input, Output, High-impedance state.


Table 2-2. TMS320C30 Signal Descriptions (Continued)

SIGNAL      #PINS   I/O/Z†   DESCRIPTION


CONTROL SIGNALS (9 PINS)

RESET                        Reset. When this pin is a logic low, the device is placed in
                             the reset condition. When reset becomes a logic high,
                             execution begins from the location specified by the reset
                             vector.

INT(3-0)    4                External interrupts.

IACK                         Interrupt acknowledge signal. IACK goes low during
                             execution of an IACK instruction. This can be used to
                             indicate the beginning or end of an interrupt service routine.

XF(1-0)     2       I/O      External flag pins. These pins are formatted as I/O through
                             a program instruction, and latched internally when used as
                             output pins. They are used as general-purpose I/O pins or
                             to support interlocked processor instructions.


SERIAL PORT 0 SIGNALS (6 PINS)

CLKX0               I/O      Serial port 0 transmit clock. This pin serves as the serial shift
                             clock for the serial port 0 transmitter.

DX0                 O/Z      Data transmit output. Serial port 0 transmits serial data on
                             this pin.

FSX0                I/O      Frame synchronization pulse for transmit. The FSX0 pulse
                             initiates the transmit data process over pin DX0.

CLKR0               I/O      Serial port 0 receive clock. This pin serves as the serial shift
                             clock for the serial port 0 receiver.

DR0                          Data receive. Serial port 0 receives serial data via the DR0
                             pin.

FSR0                         Frame synchronization pulse for receive. The FSR0 pulse
                             initiates the receive data process over DR0.


SERIAL PORT 1 SIGNALS (6 PINS)

CLKX1               I/O      Serial port 1 transmit clock. This pin serves as the serial shift
                             clock for the serial port 1 transmitter.

DX1                 O/Z      Data transmit output. Serial port 1 transmits serial data on
                             this pin.

FSX1                         Frame synchronization pulse for transmit. The FSX1 pulse
                             initiates the transmit data process over pin DX1.


† Input, Output, High-impedance state.
Table 2-2. TMS320C30 Signal Descriptions (Continued)

SIGNAL        #PINS   I/O/Z†   DESCRIPTION


TIMER 0 SIGNALS (1 PIN)

TCLK0                 I/O      Timer clock. As an input, TCLK0 is used by timer 0 to count
                               external pulses. As an output pin, TCLK0 outputs pulses
                               generated by timer 0.


TIMER 1 SIGNALS (1 PIN)

TCLK1                 I/O      Timer clock. As an input, TCLK1 is used by timer 1 to count
                               external pulses. As an output pin, TCLK1 outputs pulses
                               generated by timer 1.


SUPPLY AND OSCILLATOR SIGNALS (29 PINS)

VDD(3-0)      4       I        Four +5 V supply pins.
IODVDD(1-0)   2       I        Two +5 V supply pins.
ADVDD(1-0)    2       I        Two +5 V supply pins.
PDVDD         1       I        One +5 V supply pin.
DDVDD(1-0)    2       I        Two +5 V supply pins.
MDVDD         1       I        One +5 V supply pin.
VSS(3-0)      4       I        Four ground pins.
DVSS(3-0)     4                Four ground pins.
CVSS(1-0)     2                Two ground pins.
IVSS                           One ground pin.
VBBP                           VBB pump oscillator output.
SUBS                           Substrate pin. Tie to ground.
X1                             Output pin from the internal oscillator for the crystal. If a
                               crystal is not used, this pin should be left unconnected.
X2/CLKIN                       Input pin to the internal oscillator from the crystal or a clock.
H1                             External H1 clock. This clock has a period equal to twice
                               CLKIN.
H3                             External H3 clock. This clock has a period equal to twice
                               CLKIN.


† Input, Output, High-impedance state.


Table 2-2. TMS320C30 Signal Descriptions (Concluded)

SIGNAL        #PINS   I/O/Z†   DESCRIPTION


RESERVED (18 PINS)

EMU(0-2)      3       I        Reserved. Use pull-ups to +5 volts. See Section 13.5.
EMU3          1                Reserved. See Section 13.5.
EMU4          1       I        Reserved. Tie to +5 volts.
RSV(0-10)     11               Reserved. Tie to +5 volts.


† Input, Output, High-impedance state.


The user must follow the connections specified for the reserved pins. All pull-up resistors must be 20 k 
ohms. All +5 volt supply pins must be connected to a common supply plane and all ground pins must 
be connected to a common ground plane. 
Section 3


Architectural Overview 


Emphasis on hardware and software system solutions to demanding arithmetic 
algorithms has resulted in the TMS320C30 architecture shown in Figure 3-1. 
High system performance is achieved through the accuracy and precision of 
the floating-point units, large on-chip memory, a high degree of parallelism, 
and the DMA controller. 


This section provides an architectural overview of the TMS320C30 processor. 
Major areas of discussion are listed below. 


Central Processing Unit (CPU) (Section 3.1 on page 3-3)
  - Floating-point/integer multiplier
  - ALU for floating-point, integer, and logical operations
  - Auxiliary register arithmetic units (ARAUs)
  - CPU register file

Memory Organization (Section 3.2 on page 3-7)
  - RAM, ROM, and cache
  - Memory maps
  - Memory addressing modes
  - Instruction set summary

Internal Bus Operation (Section 3.3 on page 3-18)

External Bus Operation (Section 3.4 on page 3-19)

Peripherals (Section 3.5 on page 3-20)
  - Timers
  - Serial ports

Direct Memory Access (DMA) (Section 3.6 on page 3-21)


Figure 3-1. TMS320C30 Block Diagram


3.1 Central Processing Unit (CPU) 


The TMS320C30 has a register-based CPU architecture. The CPU consists 
of the following components: 


Floating-point/integer multiplier
ALU for performing arithmetic (floating-point, integer) and logical operations
32-bit barrel shifter
Internal buses (CPU1/CPU2 and REG1/REG2)
Auxiliary register arithmetic units (ARAUs)
CPU register file


Figure 3-2 shows the various CPU components that are discussed in the 
succeeding subsections. 
Figure 3-2. Central Processing Unit (CPU)


3.1.1 Multiplier 


The multiplier performs single-cycle multiplications on 24-bit integer and
32-bit floating-point values. The TMS320C30 implementation of floating-point
arithmetic allows for floating-point operations at fixed-point speeds via
a 60-ns instruction cycle and a high degree of parallelism. To gain even higher
throughput, a multiply and ALU operation can be performed in a single cycle
by using parallel instructions.


When performing floating-point multiplication, the inputs are 32-bit float- 
ing-point numbers, and the result is a 40-bit floating-point number. When 
performing integer multiplication, the input data is 24 bits and yields a 32-bit 
result. Refer to Section 5 for detailed information on data formats and float- 
ing-point operation. 
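
The integer case described above (24-bit inputs, 32-bit result) can be modeled in a few lines of Python. This is a minimal sketch under stated assumptions, not a description of the hardware: it assumes the inputs are the low 24 bits of each operand treated as signed values, that the result is the low 32 bits of the full product, and that overflow means the full product does not fit in 32 bits. The names `sign_extend` and `mpy_int24` are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def mpy_int24(a, b):
    """Multiply two 24-bit signed inputs; return (32-bit result, overflow).

    Assumption: the result keeps the low 32 bits of the full product, and
    overflow is reported when the full product does not fit in 32 bits.
    """
    product = sign_extend(a, 24) * sign_extend(b, 24)
    result = sign_extend(product, 32)
    return result, result != product


print(mpy_int24(3, -4))
print(mpy_int24(0x7FFFFF, 0x7FFFFF)[1])
```

The second call multiplies the largest positive 24-bit value by itself, which is one way to see why a 32-bit result cannot hold every 24 x 24-bit product.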


3.1.2 Arithmetic Logic Unit (ALU) 


The ALU performs single-cycle operations on 32-bit integer, 32-bit logical, 
and 40-bit floating-point data, including single-cycle integer and floating- 
point conversions. Results of the ALU are always maintained in 32-bit integer 
or 40-bit floating-point formats. The barrel shifter is used to shift up to 32 
bits left or right in a single cycle. 


Internal buses, CPU1/CPU2 and REG1/REG2, carry two operands from
memory and two operands from the register file, thus allowing parallel multiplies
and adds/subtracts on four integer or floating-point operands in a single cycle.


3.1.3 Auxiliary Register Arithmetic Units (ARAUs) 


Two auxiliary register arithmetic units (ARAU0 and ARAU1) can generate two
addresses in a single cycle. The ARAUs operate in parallel with the multiplier
and ALU. They support addressing with displacements, index registers (IR0
and IR1), and circular and bit-reversed addressing. Refer to Section 6 for a
description of addressing modes.
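
For readers unfamiliar with the two special addressing styles named above, the Python sketch below shows the general idea of each. It is a generic model, not the device's exact address-generation algorithm (Section 6 defines that); the function names and the `base` and `block_size` parameters are hypothetical.

```python
def bit_reverse(index, bits):
    """Reverse the low `bits` bits of index (the reordering used by FFTs)."""
    out = 0
    for _ in range(bits):
        out = (out << 1) | (index & 1)
        index >>= 1
    return out


def circular_step(addr, step, base, block_size):
    """Advance addr by step, wrapping inside [base, base + block_size)."""
    return base + (addr - base + step) % block_size


# Bit-reversed order for an 8-point transform.
print([bit_reverse(i, 3) for i in range(8)])

# Stepping through a 4-word circular buffer that starts at 0x100.
print(hex(circular_step(0x103, 2, 0x100, 4)))
```

Circular addressing keeps a pointer inside a fixed-size buffer without explicit bounds checks, which suits delay lines in filters; bit-reversed addressing produces the data ordering that radix-2 FFTs require.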


3.1.4 CPU Register File 


The TMS320C30 provides 28 registers in a multiport register file that is tightly
coupled to the CPU. All of these registers can be operated upon by the
multiplier and ALU, and can be used as general-purpose registers. However,
some registers are better suited to particular functions than others. For
example, the eight extended-precision registers are especially
suited for maintaining extended-precision floating-point results. The eight
auxiliary registers support a variety of indirect addressing modes and can be
used as general-purpose 32-bit integer and logical registers. The remaining
registers provide system functions such as addressing, stack management,
processor status, interrupts, and block repeat. Refer to Section 6 for detailed
information and examples of stack management and register usage.


The register names and assigned functions are listed in Table 3-1. Following
the table, the function of each register or group of registers is briefly
described. Refer to Section 4 for detailed information on each of the CPU
registers.
Table 3-1. CPU Registers


REGISTER NAME    ASSIGNED FUNCTION

R0               Extended-precision register 0
R1               Extended-precision register 1
R2               Extended-precision register 2
R3               Extended-precision register 3
R4               Extended-precision register 4
R5               Extended-precision register 5
R6               Extended-precision register 6
R7               Extended-precision register 7

AR0              Auxiliary register 0
AR1              Auxiliary register 1
AR2              Auxiliary register 2
AR3              Auxiliary register 3
AR4              Auxiliary register 4
AR5              Auxiliary register 5
AR6              Auxiliary register 6
AR7              Auxiliary register 7

DP               Data page pointer
IR0              Index register 0
IR1              Index register 1
BK               Block size
SP               System stack pointer

ST               Status register
IE               CPU/DMA interrupt enable
IF               CPU interrupt flags
IOF              I/O flags

RS               Repeat start address
RE               Repeat end address
RC               Repeat counter

PC               Program counter


The extended-precision registers (R0-R7) are capable of storing and
supporting operations on 32-bit integer and 40-bit floating-point numbers.
Any instruction that assumes the operands are floating-point numbers uses
bits 39-0. If the operands are either signed or unsigned integers, only bits
31-0 are used; bits 39-32 remain unchanged. This is true for all shift
operations. Refer to Section 4 for extended-precision register formats for
floating-point and integer numbers.
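
The rule that integer results touch only bits 31-0 can be expressed as a simple mask operation. The following Python sketch models a 40-bit register as an integer; the helper name `write_integer` and the mask constants are illustrative and are not part of the device documentation.

```python
MASK40 = (1 << 40) - 1  # full extended-precision register width
MASK32 = (1 << 32) - 1  # integer portion, bits 31-0


def write_integer(reg40, value):
    """Store a 32-bit integer result; bits 39-32 of the register are kept."""
    upper = reg40 & MASK40 & ~MASK32  # preserve bits 39-32
    return upper | (value & MASK32)   # replace bits 31-0


print(hex(write_integer(0xAB12345678, 0xDEADBEEF)))
```

In this model the upper byte (0xAB) survives the integer write, mirroring the statement that bits 39-32 remain unchanged.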


The 32-bit auxiliary registers (AR0-AR7) can be accessed by the CPU and
modified by the two Auxiliary Register Arithmetic Units (ARAUs). The primary
function of the auxiliary registers is the generation of 24-bit addresses. They
can also serve other purposes, such as loop counters or 32-bit
general-purpose registers that can be modified by the multiplier and
ALU. Refer to Section 6 for detailed information and examples of the use of
auxiliary registers in addressing.


The data page pointer (DP) is a 32-bit register. The eight LSBs of the data 
page pointer are used by the direct addressing mode as a pointer to the page 
of data being addressed. Data pages are 64 k words long with a total of 256 
pages. 
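
The page arithmetic above (256 pages of 64K words, selected by the eight LSBs of DP) implies a 24-bit address made of an 8-bit page number and a 16-bit offset within the page. The Python sketch below shows that composition. The 16-bit offset supplied by the instruction is an assumption inferred from the 64K-word page size, and `direct_address` is a hypothetical helper name; Section 6 gives the actual instruction encoding.

```python
PAGE_BITS = 8     # eight LSBs of the data page pointer select the page
OFFSET_BITS = 16  # 64K words per page (assumed 16-bit in-page offset)


def direct_address(dp, offset):
    """Combine the DP page number and an in-page offset into an address."""
    page = dp & ((1 << PAGE_BITS) - 1)         # upper DP bits are ignored
    word = offset & ((1 << OFFSET_BITS) - 1)
    return (page << OFFSET_BITS) | word


print(hex(direct_address(0x12345680, 0x1234)))
```

Only the low byte of DP (0x80 here) contributes to the address, and 256 pages of 65,536 words together cover the full 24-bit, 16M-word address space.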




The 32-bit index registers (IR0 and IR1) are used by the Auxiliary Register
Arithmetic Unit (ARAU) for indexing the address. Refer to Section 6 for
examples of the use of index registers in addressing.


The 32-bit block size register (BK) is used by the ARAU in circular ad- 
dressing to specify the data block size. 


The system stack pointer (SP) is a 32-bit register that contains the address
of the top of the system stack. The SP always points to the last element
pushed onto the stack. A push performs a preincrement of the system stack
pointer, and a pop performs a postdecrement. The SP is manipulated by
interrupts, traps, calls, returns, and the PUSH and POP instructions. Refer to
Section 6.5 for information about system stack management.


The status register (ST) contains global information relating to the state 
of the CPU. Typically, operations set the condition flags of the status register
according to whether the result is zero, negative, etc. This includes register 
load and store operations as well as arithmetic and logical functions. When 
the status register is loaded, however, a bit-for-bit replacement is performed 
on the current contents with the contents of the source operand regardless of 
the state of any bits in the source operand. Therefore, following a load, the 
contents of the status register are identically equal to the contents of the 
source operand. This allows the status register to be easily saved and restored. 
See Table 4-2 for a list and definitions of the status register bits.


The CPU/DMA interrupt enable register (IE) is a 32-bit register. The 
CPU interrupt enable bits are in locations 10-0. The DMA interrupt enable 
bits are in locations 26-16. A 1 in a CPU/DMA interrupt enable register bit
enables the corresponding interrupt. A 0 disables the corresponding interrupt. 
Refer to Section 4.1 for bit definitions. 
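
The bit ranges described above can be illustrated with a minimal Python sketch. It is only a model of the IE register layout, not TMS320C30 code; the helper names are hypothetical, and the assumption that DMA enable bit n sits at position 16 + n is made for illustration (Section 4.1 gives the actual bit definitions).

```python
# Model of the IE register layout described in the text.
CPU_ENABLE_MASK = 0x000007FF  # CPU interrupt enable bits, locations 10-0
DMA_ENABLE_MASK = 0x07FF0000  # DMA interrupt enable bits, locations 26-16


def enable_cpu_interrupt(ie, n):
    """Return IE with CPU enable bit n (0-10) set to 1."""
    if not 0 <= n <= 10:
        raise ValueError("CPU enable bits occupy locations 10-0")
    return (ie | (1 << n)) & 0xFFFFFFFF


def cpu_interrupt_enabled(ie, n):
    """True if CPU enable bit n (0-10) is 1."""
    if not 0 <= n <= 10:
        raise ValueError("CPU enable bits occupy locations 10-0")
    return bool((ie >> n) & 1)


def dma_interrupt_enabled(ie, n):
    """True if DMA enable bit n is 1 (assumed to sit at position 16 + n)."""
    if not 0 <= n <= 10:
        raise ValueError("DMA enable bits occupy locations 26-16")
    return bool((ie >> (16 + n)) & 1)
```

For example, setting CPU enable bit 3 in a cleared register yields 00000008h and leaves every DMA enable bit at 0.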


The CPU interrupt flag register (IF) is also a 32-bit register (see Section 
4.1). A 1 in a CPU interrupt flag register bit indicates that the corresponding
interrupt is set. A 0 indicates that the corresponding interrupt is not set. 


The I/O flags register (IOF) controls the function of the dedicated external 
pins, XF0 and XF1. These pins may be configured for input or output, and
they may also be read from and written to. See Section 4.1 for detailed infor- 
mation. 


The repeat counter (RC) is a 32-bit register used to specify the number of 
times a block of code is to be repeated when performing a block repeat. When 
operating in the repeat mode, the 32-bit repeat start address register 
(RS) contains the starting address of the block of program memory to be re- 
peated and the 32-bit repeat end address register (RE) contains the 
ending address of the block to be repeated. 
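
How RS, RE, and RC cooperate can be sketched with a small Python model that lists the sequence of fetch addresses. This is an illustration only: the function name is hypothetical, and it assumes the block is fetched once and then repeated RC more times, which is one reading of "the number of times a block of code is to be repeated".

```python
def block_repeat_trace(rs, re, rc):
    """Return the fetch addresses for a block running from RS to RE.

    Assumption: the block runs once, then repeats RC more times.
    """
    if rs > re or rc < 0:
        raise ValueError("expects RS <= RE and RC >= 0")
    trace = []
    pc = rs
    while True:
        trace.append(pc)
        if pc == re:
            if rc == 0:
                break          # no repeats left: fall out of the block
            rc -= 1            # count one repeat
            pc = rs            # jump back to the start of the block
        else:
            pc += 1            # ordinary sequential fetch
    return trace
```

Under that assumption, a three-word block with RC = 1 produces six fetches: the block addresses in order, twice.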


The program counter (PC) is a 32-bit register containing the address of the 
next instruction to be fetched. Although the PC is not part of the CPU register 
file, it is a register that can be modified by instructions that modify the program 
flow. 
3.2 Memory Organization


The total memory space of the TMS320C30 is 16M (million) 32-bit words. 
Program, data, and I/O space are contained within this 16M-word address
space, thus allowing tables, coefficients, program code, or data to be stored 
in either RAM or ROM. In this way, memory usage can be maximized and 
memory space allocated as desired. 


3.2.1 RAM, ROM, and Cache 


Figure 3-3 shows how the memory is organized on the TMS320C30. RAM
blocks 0 and 1 are each 1K x 32 bits. The ROM block is 4K x 32 bits. Each 
RAM and ROM block is capable of supporting two accesses in a single cycle. 
The separate program buses, data buses, and DMA buses allow for parallel 
program fetches, data reads and writes, and DMA operations. For example: 
the CPU can access two data values in one RAM block and perform an ex- 
ternal program fetch in parallel with the DMA loading another RAM block, all 
within a single cycle. 


A 64 x 32-bit instruction cache is provided to store often repeated sections 
of code, thus greatly reducing the number of off-chip accesses necessary. This 
allows for code to be stored off-chip in slower, lower-cost memories. The 
external buses are also freed for use by the DMA, external memory fetches, or 
other devices in the system. 


Refer to Section 4 for detailed information about the memory and instruction 
cache. 


Figure 3-3. Memory Organization


3.2.2 Memory Maps


The memory map depends on whether the processor is running in
microprocessor mode (MC/MP = 0) or microcomputer mode (MC/MP = 1).
The memory maps for these modes are very similar (see Figure 3-4).
Locations 800000h through 801FFFh are mapped to the expansion bus; when
this region is accessed, MSTRB is active. Locations 802000h through
803FFFh are reserved. Locations 804000h through 805FFFh are mapped to
the expansion bus; when this region is accessed, IOSTRB is active. Locations
806000h through 807FFFh are reserved. All of the memory-mapped
peripheral registers are in locations 808000h through 8097FFh. In both modes,
RAM block 0 is located at addresses 809800h through 809BFFh, and RAM
block 1 is located at addresses 809C00h through 809FFFh. Locations
80A000h through 0FFFFFFh are accessed over the external memory port
(STRB active).


In microprocessor mode, the 4K on-chip ROM is not mapped into the
TMS320C30 memory map. Locations 0h through 3Fh consist of interrupt
vector, trap vector, and reserved locations, all of which are accessed over the
external memory port (STRB active). Locations 40h through 7FFFFFh are also
accessed over the external memory port.


In microcomputer mode, the 4K on-chip ROM is mapped into locations 0h
through 0FFFh. There are 192 locations (0h through BFh) within this block
for interrupt vectors, trap vectors, and a reserved space. Locations 1000h 
through 7FFFFFh are accessed over the external memory port (STRB active). 


Section 4.2 describes the memory maps in greater detail. The peripheral bus 
map and the vector locations for reset, interrupts, and traps are also given. 
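
The address ranges given in this section can be collected into a small lookup, sketched here in Python as an aid to reading the map. The function name and region labels are illustrative only; the ranges are transcribed from the paragraphs above, and Section 4.2 remains the authoritative description.

```python
# Regions at 800000h and above, as listed in the text (inclusive bounds).
UPPER_REGIONS = [
    (0x800000, 0x801FFF, "expansion bus (MSTRB active)"),
    (0x802000, 0x803FFF, "reserved"),
    (0x804000, 0x805FFF, "expansion bus (IOSTRB active)"),
    (0x806000, 0x807FFF, "reserved"),
    (0x808000, 0x8097FF, "peripheral registers"),
    (0x809800, 0x809BFF, "RAM block 0"),
    (0x809C00, 0x809FFF, "RAM block 1"),
    (0x80A000, 0xFFFFFF, "external memory port (STRB active)"),
]


def decode_address(addr, microcomputer_mode):
    """Name the region that a 24-bit address falls in for the given mode."""
    if not 0 <= addr <= 0xFFFFFF:
        raise ValueError("address must fit in 24 bits")
    if addr <= 0x7FFFFF:
        # Only the first 4K words differ between the two modes.
        if microcomputer_mode and addr <= 0x0FFF:
            return "on-chip ROM"
        return "external memory port (STRB active)"
    for low, high, name in UPPER_REGIONS:
        if low <= addr <= high:
            return name
    raise AssertionError("unreachable: the regions cover 800000h-FFFFFFh")
```

For instance, address 0h decodes to the on-chip ROM in microcomputer mode and to the external memory port in microprocessor mode, while 809800h is RAM block 0 in both.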


Figure 3-4. Memory Maps


3.2.3 Memory Addressing Modes


The TMS320C30 supports a base set of general-purpose instructions as well 
as arithmetic-intensive instructions that are particularly suited for digital signal 
processing and other numeric-intensive applications. Refer to Section 6 for 
detailed information on addressing. 


Five groups of addressing modes are provided on the TMS320C30. Six types 
of addressing may be used within the groups, as shown in the following list: 


•  General addressing modes:
   -  Register. The operand is a CPU register.
   -  Short immediate. The operand is a 16-bit immediate value.
   -  Direct. The operand is the contents of a 24-bit address.
   -  Indirect. An auxiliary register indicates the address of the operand.


•  Three-operand addressing modes:
   -  Register. Same as for general addressing mode.
   -  Indirect. Same as for general addressing mode.


•  Parallel addressing modes:
   -  Register. The operand is an extended-precision register.
   -  Indirect. Same as for general addressing mode.


•  Long-immediate addressing mode:
   -  Long immediate. The operand is a 24-bit immediate value.


•  Conditional branch addressing modes:
   -  Register. Same as for general addressing mode.
   -  PC-relative. A signed 16-bit displacement is added to the PC.


3.2.4 Instruction Set Summary 


Table 3-2 lists the TMS320C30 instruction set in alphabetical order. Each 
table entry shows the instruction mnemonic, description, and operation. Refer 
to Section 11 for a functional listing of the instructions and individual in- 
struction descriptions. 


Table 3-2. Instruction Set Summary


MNEMONIC    DESCRIPTION                              OPERATION

ADDC        Add integers with carry                  src + Dreg + C → Dreg
ADDC3       Add integers with carry (3-operand)      src1 + src2 + C → Dreg
ANDN        Bitwise logical-AND with complement      Dreg AND (NOT src) → Dreg
ANDN3       Bitwise logical-ANDN (3-operand)         src1 AND (NOT src2) → Dreg
ASH         Arithmetic shift                         If count ≥ 0:
                                                       (Shift Dreg left by count) → Dreg
                                                     Else:
                                                       (Shift Dreg right by |count|) → Dreg
ASH3        Arithmetic shift (3-operand)             If count ≥ 0:
                                                       (Shift src left by count) → Dreg
                                                     Else:
                                                       (Shift src right by |count|) → Dreg
Bcond       Branch conditionally (standard)          If cond = true:
                                                       If Csrc is a register, Csrc → PC
                                                       If Csrc is a value, Csrc + PC → PC
                                                     Else, PC + 1 → PC
BcondD      Branch conditionally (delayed)           If cond = true:
                                                       If Csrc is a register, Csrc → PC
                                                       If Csrc is a value, Csrc + PC + 3 → PC
                                                     Else, PC + 1 → PC


LEGEND:
src    - general addressing modes                 Dreg  - register address (any register)
src1   - three-operand addressing modes           Rn    - register address (R0-R7)
src2   - three-operand addressing modes           Daddr - destination memory address
Csrc   - conditional-branch addressing modes      ARn   - auxiliary register n (AR0-AR7)
Sreg   - register address (any register)          addr  - 24-bit immediate address (label)
count  - shift value (general addressing modes)   cond  - condition code (see Section 11)
SP     - stack pointer                            ST    - status register
GIE    - global interrupt enable bit              RE    - repeat end address register
RM     - repeat mode bit                          RS    - repeat start address register
TOS    - top of stack                             PC    - program counter


Table 3-2. Instruction Set Summary (Continued)


MNEMONIC    DESCRIPTION                              OPERATION

BR          Branch unconditionally (standard)        Value → PC
BRD         Branch unconditionally (delayed)         Value → PC
CALL        Call subroutine                          PC + 1 → TOS
                                                     Value → PC
CALLcond    Call subroutine conditionally            If cond = true:
                                                       PC + 1 → TOS
                                                       If Csrc is a register, Csrc → PC
                                                       If Csrc is a value, Csrc + PC → PC
                                                     Else, PC + 1 → PC
CMPF        Compare floating-point values            Set flags on Rn - src
CMPF3       Compare floating-point values            Set flags on src1 - src2
            (3-operand)
CMPI3       Compare integers (3-operand)             Set flags on src1 - src2
DBcond      Decrement and branch conditionally       ARn - 1 → ARn
            (standard)                               If cond = true and ARn ≥ 0:
                                                       If Csrc is a register, Csrc → PC
                                                       If Csrc is a value, Csrc + PC → PC
                                                     Else, PC + 1 → PC
DBcondD     Decrement and branch conditionally       ARn - 1 → ARn
            (delayed)                                If cond = true and ARn ≥ 0:
                                                       If Csrc is a register, Csrc → PC
                                                       If Csrc is a value, Csrc + PC + 3 → PC
                                                     Else, PC + 1 → PC
FIX         Convert floating-point value to integer  Fix(src) → Dreg
FLOAT       Convert integer to floating-point value  Float(src) → Rn
IDLE        Idle until interrupt                     PC + 1 → PC
                                                     Idle until next interrupt
LDE         Load floating-point exponent             src(exponent) → Rn(exponent)




Table 3-2. Instruction Set Summary (Continued)


MNEMONIC    DESCRIPTION                              OPERATION

LDFcond     Load floating-point value                If cond = true, src → Rn
            conditionally                            Else, Rn is not changed
LDFI        Load floating-point value,               Signal interlocked operation
            interlocked                              src → Rn
LDI         Load integer                             src → Dreg
LDIcond     Load integer conditionally               If cond = true, src → Dreg
                                                     Else, Dreg is not changed
LDII        Load integer, interlocked                Signal interlocked operation
                                                     src → Dreg
LDM         Load floating-point mantissa             src(mantissa) → Rn(mantissa)
LSH         Logical shift                            If count ≥ 0:
                                                       (Dreg left-shifted by count) → Dreg
                                                     Else:
                                                       (Dreg right-shifted by |count|) → Dreg
LSH3        Logical shift (3-operand)                If count ≥ 0:
                                                       (src left-shifted by count) → Dreg
                                                     Else:
                                                       (src right-shifted by |count|) → Dreg
MPYF        Multiply floating-point values           src x Rn → Rn
MPYF3       Multiply floating-point values           src1 x src2 → Rn
            (3-operand)
NEGI        Negate integer                           0 - src → Dreg
NOP         No operation                             Modify ARn if specified
NORM        Normalize floating-point value           Normalize(src) → Rn
OR          Bitwise logical-OR                       Dreg OR src → Dreg
OR3         Bitwise logical-OR (3-operand)           src1 OR src2 → Dreg




Table 3-2. Instruction Set Summary (Continued)


MNEMONIC    DESCRIPTION                              OPERATION

POP         Pop integer from stack                   *SP-- → Dreg
PUSH        Push integer on stack                    Sreg → *++SP
PUSHF       Push floating-point value on stack       Rn → *++SP
RETIcond    Return from interrupt conditionally      If cond = true or missing:
                                                       *SP-- → PC
                                                       1 → ST (GIE)
                                                     Else, continue
RETScond    Return from subroutine conditionally     If cond = true or missing:
                                                       *SP-- → PC
                                                     Else, continue
RND         Round floating-point value               Round(src) → Rn
ROR         Rotate right                             Dreg rotated right 1 bit → Dreg
RPTB        Repeat block of instructions             src → RE
                                                     1 → ST (RM)
                                                     Next PC → RS
RPTS        Repeat single instruction                src → RC
                                                     1 → ST (RM)
                                                     Next PC → RS
                                                     Next PC → RE
SIGI        Signal, interlocked                      Signal interlocked operation
                                                     Wait for interlock acknowledge
                                                     Clear interlock
STF         Store floating-point value               Rn → Daddr
STFI        Store floating-point value, interlocked  Rn → Daddr
                                                     Signal end of interlocked operation
STI         Store integer                            Sreg → Daddr
STII        Store integer, interlocked               Sreg → Daddr
                                                     Signal end of interlocked operation




Table 3-2. Instruction Set Summary (Continued)


MNEMONIC    DESCRIPTION                              OPERATION

SUBB        Subtract integers with borrow            Dreg - src - C → Dreg
SUBB3       Subtract integers with borrow            src1 - src2 - C → Dreg
            (3-operand)
SUBC        Subtract integers conditionally          If Dreg - src ≥ 0:
                                                       [(Dreg - src) << 1] OR 1 → Dreg
                                                     Else, Dreg << 1 → Dreg
SWI         Software interrupt                       Perform emulator interrupt sequence
TRAPcond    Trap conditionally                       If cond = true or missing:
                                                       Next PC → *++SP
                                                       Trap vector N → PC
                                                       0 → ST (GIE)
                                                     Else, continue




Table 3-2. Instruction Set Summary (Continued)


MNEMONIC    DESCRIPTION                              OPERATION

PARALLEL ARITHMETIC WITH STORE INSTRUCTIONS

ABSF        Absolute value of a floating-point       |src2| → dst1
|| STF                                               || src3 → dst2
ABSI        Absolute value of an integer             |src2| → dst1
|| STI                                               || src3 → dst2
ADDF3       Add floating-point                       src1 + src2 → dst1
|| STF                                               || src3 → dst2
ADDI3       Add integer                              src1 + src2 → dst1
|| STI                                               || src3 → dst2
AND3        Bitwise logical-AND                      src1 AND src2 → dst1
|| STI                                               || src3 → dst2
ASH3        Arithmetic shift                         If count ≥ 0:
|| STI                                                 src2 << count → dst1
                                                       || src3 → dst2
                                                     Else:
                                                       src2 >> |count| → dst1
                                                       || src3 → dst2
FIX         Convert floating-point to integer        Fix(src2) → dst1
|| STI                                               || src3 → dst2
FLOAT       Convert integer to floating-point        Float(src2) → dst1
|| STF                                               || src3 → dst2
LDF         Load floating-point                      src2 → dst1
|| STF                                               || src3 → dst2
LDI         Load integer                             src2 → dst1
|| STI                                               || src3 → dst2
LSH3        Logical shift                            If count ≥ 0:
|| STI                                                 src2 << count → dst1
                                                       || src3 → dst2
                                                     Else:
                                                       src2 >> |count| → dst1
                                                       || src3 → dst2
MPYF3       Multiply floating-point                  src1 x src2 → dst1
|| STF                                               || src3 → dst2
MPYI3       Multiply integer                         src1 x src2 → dst1
|| STI                                               || src3 → dst2
NEGF        Negate floating-point                    0 - src2 → dst1
|| STF                                               || src3 → dst2


LEGEND:
src1 - register addr (R0-R7)     src2 - indirect addr (disp = 0, 1, IR0, IR1)
src3 - register addr (R0-R7)     src4 - indirect addr (disp = 0, 1, IR0, IR1)
dst1 - register addr (R0-R7)     dst2 - indirect addr (disp = 0, 1, IR0, IR1)


Table 3-2. Instruction Set Summary (Concluded)


MNEMONIC    DESCRIPTION                              OPERATION

PARALLEL ARITHMETIC WITH STORE INSTRUCTIONS (Concluded)

NEGI        Negate integer                           0 - src2 → dst1
|| STI                                               || src3 → dst2
NOT3        Complement                               NOT src1 → dst1
|| STI                                               || src3 → dst2
OR3         Bitwise logical-OR                       src1 OR src2 → dst1
|| STI                                               || src3 → dst2
STF         Store floating-point                     src1 → dst1
|| STF                                               || src3 → dst2
STI         Store integer                            src1 → dst1
|| STI                                               || src3 → dst2
SUBF3       Subtract floating-point                  src1 - src2 → dst1
|| STF                                               || src3 → dst2
SUBI3       Subtract integer                         src1 - src2 → dst1
|| STI                                               || src3 → dst2
XOR3        Bitwise exclusive-OR                     src1 XOR src2 → dst1
|| STI                                               || src3 → dst2

PARALLEL LOAD INSTRUCTIONS

LDF         Load floating-point                      src2 → dst1
|| LDF                                               || src4 → dst2
LDI         Load integer                             src2 → dst1
|| LDI                                               || src4 → dst2

PARALLEL MULTIPLY AND ADD/SUBTRACT INSTRUCTIONS

MPYF3       Multiply and add floating-point          op1 x op2 → op3
|| ADDF3                                             || op4 + op5 → op6
MPYF3       Multiply and subtract floating-point     op1 x op2 → op3
|| SUBF3                                             || op4 - op5 → op6
MPYI3       Multiply and add integer                 op1 x op2 → op3
|| ADDI3                                             || op4 + op5 → op6
MPYI3       Multiply and subtract integer            op1 x op2 → op3
|| SUBI3                                             || op4 - op5 → op6


LEGEND:
src1 - register addr (R0-R7)     src2 - indirect addr (disp = 0, 1, IR0, IR1)
src3 - register addr (R0-R7)     src4 - indirect addr (disp = 0, 1, IR0, IR1)
dst1 - register addr (R0-R7)     dst2 - indirect addr (disp = 0, 1, IR0, IR1)
op3  - register addr (R0 or R1)  op6  - register addr (R2 or R3)
op1, op2, op4, op5 - Two of these operands must be specified using register addr.
and two must be specified using indirect
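

The SUBC entry in Table 3-2 describes a conditional-subtract step of the kind commonly used to build integer division. The Python sketch below models that step and shows one way a repeated sequence of steps could produce a quotient and remainder. It is an illustration under stated assumptions, not TMS320C30 code: the function names and the 16-bit operand width are hypothetical, and the comparison is taken as "greater than or equal to zero".

```python
def subc_step(dreg, src):
    """One SUBC step on a 32-bit register value, as described in Table 3-2."""
    diff = dreg - src
    if diff >= 0:
        return ((diff << 1) | 1) & 0xFFFFFFFF   # subtract, shift, set low bit
    return (dreg << 1) & 0xFFFFFFFF             # shift only


def divide_unsigned(dividend, divisor, bits=16):
    """Divide two unsigned values of at most `bits` bits using SUBC steps.

    The divisor is pre-aligned to the top of the operand range; after
    `bits` steps the low `bits` bits hold the quotient and the bits above
    them hold the remainder.
    """
    if not 0 <= dividend < (1 << bits) or not 0 < divisor < (1 << bits):
        raise ValueError("operands must be unsigned and fit in `bits` bits")
    src = divisor << (bits - 1)
    dreg = dividend
    for _ in range(bits):
        dreg = subc_step(dreg, src)
    return dreg & ((1 << bits) - 1), dreg >> bits
```

With these assumptions, dividing 7 by 2 returns a quotient of 3 and a remainder of 1.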




3.3 Internal Bus Operation 


A large portion of the TMS320C30’s high performance is due to the internal
busing and the parallelism possible because of this busing. The separate 
program buses (PADDR and PDATA), data buses (DADDR1, DADDR2, and 
DDATA), and DMA buses (DMAADDR and DMADATA) allow for parallel 
program fetches, data accesses, and DMA accesses. These buses connect all 
of the physical spaces (on-chip memory, off-chip memory, and on-chip pe- 
ripherals) supported by the TMS320C30. 


The program counter (PC) is connected to the 24-bit program address bus 
(PADDR). The instruction register (IR) is connected to the 32-bit program 
data bus (PDATA). These buses can fetch a single instruction word every 
machine cycle. 


The 24-bit data address buses (DADDR1 and DADDR2) and the 32-bit data 
data bus (DDATA) support two data memory accesses every machine cycle. 
The DDATA bus carries data to the CPU over the CPU1 and CPU2 buses. The 
CPU1 and CPU2 buses can carry two data memory operands to the multiplier, 
ALU, and register file every machine cycle. Also internal to the CPU are reg- 
ister buses REG1 and REG2 that can carry two data values from the register 
file to the multiplier and ALU every machine cycle. 


The DMA controller is supported with a 24-bit address bus (DMAADDR) and 
a 32-bit data bus (DMADATA). These buses allow the DMA to perform me- 
mory accesses in parallel with the memory accesses occurring from the data 
and program buses. 


3.4 External Bus Operation


The TMS320C30 provides two external interfaces: the primary bus and ex- 
pansion bus. Both consist of a 32-bit data bus and a set of control signals. 
The primary bus has a 24-bit address bus, whereas the expansion bus has a 
13-bit address bus. Both buses can be used to address external program/data 
memory or I/O space. The buses also have an external RDY signal for wait-
state generation. Additional wait states may be inserted under software con- 
trol. Refer to Section 8 for detailed information on external bus operation. 


The TMS320C30 supports four external interrupts (INT3-INT0), a number of
internal interrupts, and a nonmaskable external RESET signal. Two external I/O
flags, XF0 and XF1, can be configured as input or output pins under software
control. These pins are also used by the interlocked operations of the 
TMS320C30. The interlocked-operations instruction group supports multi- 
processor communication (see Section 7 for examples of the use of inter- 
locked instructions). 


3.5 Peripherals


All TMS320C30 peripherals are controlled through memory-mapped registers
on a dedicated peripheral bus, composed of a 32-bit data bus and a 24-bit 
address bus. This peripheral bus permits straightforward communication to 
the peripherals. The TMS320C30 peripherals include two timers and two se- 
rial ports. Figure 3-5 shows the peripherals with associated buses and signals. 
Refer to Section 9 for detailed information on the peripherals. 


Figure 3-5. Peripheral Modules


3.5.1 Timers


The two timer modules are general-purpose 32-bit timer/event counters, with
two signaling modes and internal or external clocking. Each timer has an I/O
pin that can be used as an input clock to the timer or as an output signal
driven by the timer. The pin may also be configured as a general-purpose I/O
pin.


3.5.2 Serial Ports


The two serial ports are totally independent. They are identical, with a
complementary set of control registers controlling each one. Each serial port can
be configured to transfer 8, 16, 24, or 32 bits of data per word. The clock for 
each serial port can originate either internally or externally. An internally 
generated divide-down clock is provided. The serial port pins are configurable 
as general-purpose I/O pins. The serial ports can also be configured as timers. 
A special handshake mode allows TMS320C30s to communicate over their 
serial ports with guaranteed synchronization. 


3.6 Direct Memory Access (DMA)


The on-chip Direct Memory Access (DMA) controller can read from or write 
to any location in the memory map without interfering with the operation of 
the CPU. Therefore, the TMS320C30 can interface to slow external memories 
and peripherals without reducing throughput to the CPU. The DMA controller 
contains its own address generators, source and destination registers, and 
transfer counter. Dedicated DMA address and data buses allow for minimi- 
zation of conflicts between the CPU and the DMA controller. A DMA opera- 
tion consists of a block or single-word transfer to or from memory. Refer to 
Section 9 for detailed information on the DMA. Figure 3-6 shows the DMA 
controller with associated buses. 


Figure 3-6. DMA Controller


In summary, the TMS320C30 is a powerful DSP system because of its
integration of a powerful CPU, large memories, and sufficient buses to support
its speed. These, along with peripherals such as a DMA controller, two serial
ports, and two timers, are all contained on a single chip. The total system real
estate and price have been reduced, providing the user with a true single-chip
solution.


Section 4


CPU Registers, Memory, and Cache 


The CPU register file contains 28 registers that can be operated upon by the 
multiplier and ALU (arithmetic logic unit). Included in the register file are the 
auxiliary registers, extended-precision registers, and index registers. The reg- 
isters in the CPU register file support addressing, floating-point/integer oper- 
ations, stack management, processor status, block repeats, and interrupts. 


The TMS320C30 provides a total memory space of 16M (million) 32-bit 
words. Program, data, and I/O space are contained within this 16M-word
address space. Two RAM blocks of 1K x 32 bits each and a ROM block of 4K 
x 32 bits permit two accesses in a single cycle. The memory maps for the 
microcomputer and microprocessor modes are similar, except that the on-chip 
ROM is not used in microprocessor mode. 


A 64 x 32-bit instruction cache stores often repeated sections of code. This 
greatly reduces the number of off-chip accesses necessary and allows code to 
be stored off-chip in slower, lower-cost memories. Three bits are provided in 
the CPU status register to control the clear, enable, or freeze of the cache. 


This section describes in detail each of the CPU registers, the memory maps, 
and the instruction cache. Major topics in this section are as follows: 


•  CPU Register File (Section 4.1)
   -  Extended-precision registers (R0-R7)
   -  Auxiliary registers (AR0-AR7)
   -  Index registers (IR0, IR1)
   -  Block size register (BK)
   -  Data page pointer (DP)
   -  System stack pointer (SP)
   -  Status register (ST)
   -  CPU/DMA interrupt enable register (IE)
   -  CPU interrupt flag register (IF)
   -  I/O flags register (IOF)
   -  Repeat counter (RC) and block repeat registers (RS, RE)
   -  Program counter (PC)


•  Memory (Section 4.2)
   -  Memory maps
   -  Peripheral bus map
   -  Reset/interrupt/trap map


•  Instruction Cache (Section 4.3)
   -  Cache architecture
   -  Cache algorithm
   -  Cache control bits


4.1 CPU Register File


The TMS320C30 provides 28 registers in a multiport register file that is tightly
coupled to the CPU. The PC is not included in the 28 registers. All of these 
registers can be operated upon by the multiplier and ALU, and can be used 
as general-purpose 32-bit registers. However, the registers also have some 
special functions for which they are more suited than others. For example, the 
eight extended-precision registers are especially suited for maintaining ex- 
tended-precision floating-point results. The eight auxiliary registers support 
a variety of indirect addressing modes and can be used as general-purpose 
32-bit integer and logical registers. The remaining registers provide system 
functions such as addressing, stack management, processor status, interrupts, 
and block repeat. Refer to Section 6 for detailed information and examples 
of the use of CPU registers in addressing. 


The register names and assigned functions are listed in Table 4-1.


Table 4-1. CPU Registers


REGISTER
NAME        ASSIGNED FUNCTION

R0          Extended-precision register 0
R1          Extended-precision register 1
R2          Extended-precision register 2
R3          Extended-precision register 3
R4          Extended-precision register 4
R5          Extended-precision register 5
R6          Extended-precision register 6
R7          Extended-precision register 7

AR0         Auxiliary register 0
AR1         Auxiliary register 1
AR2         Auxiliary register 2
AR3         Auxiliary register 3
AR4         Auxiliary register 4
AR5         Auxiliary register 5
AR6         Auxiliary register 6
AR7         Auxiliary register 7

DP          Data page pointer
IR0         Index register 0
IR1         Index register 1
BK          Block size
SP          System stack pointer

ST          Status register
IE          CPU/DMA interrupt enable
IF          CPU interrupt flags
IOF         I/O flags

RS          Repeat start address
RE          Repeat end address
RC          Repeat counter

PC          Program counter


4.1.1 Extended-Precision Registers (R0-R7)


The eight extended-precision registers (R0-R7) are capable of storing and
supporting operations on 32-bit integer and 40-bit floating-point numbers.
These registers consist of two separate and distinct regions. Bits 39-32 of the
extended-precision registers are dedicated to the storage of the exponent (e)
of the floating-point number. Bits 31-0 store the mantissa of the
floating-point number: bit 31 is the sign (s) bit, and bits 30-0 are the
fraction (f). Any instruction that assumes the operands are floating-point
numbers uses bits 39-0. Figure 4-1 illustrates the storage of 40-bit
floating-point numbers in the extended-precision registers.


 39        32  31  30                        0
+------------+---+---------------------------+
|     e      | s |             f             |
+------------+---+---------------------------+
             |<---------- mantissa --------->|


Figure 4-1. Extended-Precision Register Floating-Point Format 


For integer operations, bits 31-0 of the extended-precision registers contain 
the integer (signed or unsigned). Any instruction that assumes the operands 
are either signed or unsigned integers uses only bits 31-0. Bits 39-32 remain 
unchanged. This is true for all shift operations. The storage of 32-bit integers 
in the extended-precision registers is shown in Figure 4-2. 


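 39        32  31                            0
+------------+-------------------------------+
| unchanged  |  signed or unsigned integer   |
+------------+-------------------------------+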


Figure 4-2. Extended-Precision Register Integer Format 
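

The two register formats can be modeled with a few bit operations. The Python sketch below is illustrative only: the function names are hypothetical, and it deals with raw bit fields without attempting to interpret the exponent or mantissa numerically.

```python
EXPONENT_MASK = 0xFF00000000   # bits 39-32
INTEGER_MASK = 0x00FFFFFFFF    # bits 31-0


def write_integer(reg, value):
    """Store a 32-bit integer result; bits 39-32 remain unchanged."""
    return (reg & EXPONENT_MASK) | (value & INTEGER_MASK)


def float_fields(reg):
    """Split a 40-bit register value into (e, s, f) raw bit fields."""
    e = (reg >> 32) & 0xFF        # exponent, bits 39-32
    s = (reg >> 31) & 0x1         # sign, bit 31
    f = reg & 0x7FFFFFFF          # fraction, bits 30-0
    return e, s, f
```

For example, writing the integer 5 into a register holding 1280000001h gives 1200000005h: the low 32 bits are replaced and the upper 8 bits are preserved.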


4.1.2 Auxiliary Registers (AR0-AR7)


The eight 32-bit auxiliary registers (AR0-AR7) can be accessed by the CPU
and modified by the two Auxiliary Register Arithmetic Units (ARAUs). The 
primary function of the auxiliary registers is the generation of 24-bit addresses. 
However, they can also be used to perform a variety of functions, such as loop 
counters in indirect addressing or as 32-bit general-purpose registers that can 
be modified by the multiplier and ALU. Refer to Section 6 for detailed infor- 
mation and examples of the use of auxiliary registers in addressing. 




4.1.3 Data Page Pointer (DP) 


The data page pointer (DP) is a 32-bit register. The eight LSBs of the data 
page pointer are used by the direct addressing mode as a pointer to the page 
of data being addressed. Data pages are 64K words long with a total of 256
pages. Bits 31-8 are reserved and should always be kept zero by the user.
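

Because there are 256 pages of 64K words each, a page number and an offset within the page together span the 24-bit address space. The Python sketch below shows that arithmetic. The function name is hypothetical, and the assumption that the offset is a 16-bit value supplied by the instruction is made for illustration; Section 6 describes the direct addressing mode itself.

```python
def direct_address(dp, offset):
    """Combine the eight LSBs of DP with a 16-bit offset into a 24-bit address."""
    page = dp & 0xFF                     # only the eight LSBs select the page
    return (page << 16) | (offset & 0xFFFF)
```

With DP = 80h and an offset of 9800h, the resulting address is 809800h, the first location of RAM block 0.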


4.1.4 Index Registers (IR0, IR1)


The 32-bit index registers (IR0 and IR1) are used by the Auxiliary Register
Arithmetic Unit (ARAU) for indexing the address. Refer to Section 6 for de- 
tailed information and examples of the use of index registers in addressing. 


4.1.5 Block Size Register (BK)


The 32-bit block size register (BK) is used by the ARAU in circular addressing 
to specify the data block size (see Section 6.3). 


4.1.6 System Stack Pointer (SP) 


The system stack pointer (SP) is a 32-bit register that contains the address of 
the top of the system stack. The SP always points to the last element pushed 
onto the stack. The SP is manipulated by interrupts, traps, calls, returns, and 
the PUSH, PUSHF, POP, and POPF instructions. Pushes and pops of the 
stack perform pre-increment and post-decrement on all 32 bits of the stack 
pointer. However, only the 24 LSBs are used as an address. Refer to Section 
6.5 for information about system stack management. 
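
The pre-increment push and post-decrement pop described above can be modeled as follows. This is a minimal sketch: the `Stack` class and its dictionary-based memory are assumptions made for illustration only.

```python
ADDR_MASK = 0xFFFFFF      # only the 24 LSBs of SP are used as an address
WORD_MASK = 0xFFFFFFFF    # SP itself is a 32-bit register

class Stack:
    def __init__(self, sp: int):
        self.sp = sp & WORD_MASK
        self.memory = {}                      # sparse model of memory (hypothetical)

    def push(self, value: int) -> None:
        self.sp = (self.sp + 1) & WORD_MASK   # pre-increment all 32 bits
        self.memory[self.sp & ADDR_MASK] = value & WORD_MASK

    def pop(self) -> int:
        value = self.memory[self.sp & ADDR_MASK]
        self.sp = (self.sp - 1) & WORD_MASK   # post-decrement all 32 bits
        return value

s = Stack(0x809800)
s.push(1)
s.push(2)
print(hex(s.sp), s.pop(), s.pop(), hex(s.sp))
```

After the two pushes, SP points to the last element pushed, as the text requires.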


4.1.7 Status Register (ST) 


The status register (ST) contains global information relating to the state of the
CPU. Typically, operations set the condition flags of the status register
according to whether the result is zero, negative, etc. This includes register load
and store operations as well as arithmetic and logical functions. When the
status register is loaded, however, a bit-for-bit replacement is performed of the
current contents with the contents of the source operand, regardless of the
state of any bits in the source operand. Therefore, following a load, the
contents of the status register are identical to the contents of the source
operand. This allows the status register to be easily saved and restored. At
system reset, 0 is written to this register.


The format of the status register is shown in Figure 4-3. Table 4-2 defines the 
status register bits, their names and functions. 


 31                                                                           16
|                                      xx                                       |

 15   14   13   12   11   10    9    8    7    6    5    4    3    2    1    0
| xx | xx |GIE | CC | CE | CF | xx | RM |OVM |LUF | LV | UF | N  | Z  | V  | C  |
           R/W  R/W  R/W  R/W       R/W  R/W  R/W  R/W  R/W  R/W  R/W  R/W  R/W


NOTE: xx = reserved bit. 
R = read, W = write. 


Figure 4-3. Status Register 


Table 4-2. Status Register Bits Summary


BIT    | NAME     | FUNCTION
0      | C        | Carry flag.
1      | V        | Overflow flag.
2      | Z        | Zero flag.
3      | N        | Negative flag.
4      | UF       | Floating-point underflow flag.
5      | LV       | Latched overflow flag.
6      | LUF      | Latched floating-point underflow flag.
7      | OVM      | Overflow mode flag. This flag affects only the integer
       |          | operations. If OVM = 0, the overflow mode is turned off;
       |          | integer results that overflow are treated in no special way.
       |          | If OVM = 1, integer results overflowing in the positive
       |          | direction are set to the most positive 32-bit
       |          | two’s-complement number (7FFFFFFFh). If OVM = 1, integer
       |          | results overflowing in the negative direction are set to the
       |          | most negative 32-bit two’s-complement number (80000000h).
       |          | Note that the function of V and LV is independent of the
       |          | setting of OVM.
8      | RM       | Repeat mode flag. If RM = 1, the PC is being modified in
       |          | either the repeat-block or repeat-single mode.
9      | Reserved | Read as 0.
10     | CF       | Cache Freeze. When CF = 1, the cache is frozen. If the cache
       |          | is enabled (CE = 1), fetches from the cache are allowed, but
       |          | no modification of the state of the cache is performed. This
       |          | function can be used to keep frequently used code resident
       |          | in the cache. At reset, 0 is written to this bit. Cache
       |          | clearing (CC = 1) is allowed when CF = 0.
11     | CE       | Cache Enable. CE = 1 enables the cache, allowing the cache
       |          | to be used according to the LRU cache algorithm. CE = 0
       |          | disables the cache; no update or modification of the cache
       |          | can be performed. No fetches are made from the cache. This
       |          | function is useful for system debug. At system reset, 0 is
       |          | written to this bit. Cache clearing (CC = 1) is allowed when
       |          | CE = 0.
12     | CC       | Cache Clear. CC = 1 invalidates all entries in the cache.
       |          | This bit is always cleared after it is written to and thus
       |          | always read as 0. At reset, 0 is written to this bit.
13     | GIE      | Global interrupt enable. If GIE = 1, the CPU responds to an
       |          | enabled interrupt. If GIE = 0, the CPU does not respond to
       |          | an enabled interrupt.
14-15  | Reserved | Read as 0.
16-31  | Reserved | Value undefined.
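
The effect of the overflow mode flag on an integer addition can be sketched as below. This is an illustrative model only: the function name `add_int32` is hypothetical, and wrapping to 32 bits when OVM = 0 is an assumption about what "treated in no special way" means.

```python
INT32_MAX = 0x7FFFFFFF     # most positive 32-bit two's-complement number
INT32_MIN = -0x80000000    # most negative 32-bit two's-complement number

def add_int32(a: int, b: int, ovm: int):
    """Return (result, v), where v is 1 if the true sum does not fit in 32 bits."""
    total = a + b
    overflow = total > INT32_MAX or total < INT32_MIN
    if overflow and ovm:
        # OVM = 1: saturate in the direction of the overflow
        result = INT32_MAX if total > 0 else INT32_MIN
    else:
        # OVM = 0 (or no overflow): keep the low 32 bits, assumed wraparound
        result = (total + 0x80000000) % 0x100000000 - 0x80000000
    return result, int(overflow)

print(add_int32(0x7FFFFFFF, 1, ovm=1))
print(add_int32(0x7FFFFFFF, 1, ovm=0))
```

The overflow indication is reported in both cases, reflecting the note that V and LV are independent of OVM.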


4.1.8 CPU/DMA Interrupt Enable Register (IE)


The CPU/DMA interrupt enable register (IE) is a 32-bit register (see Figure
4-4). The CPU interrupt enable bits are in locations 10-0. The DMA interrupt
enable bits are in locations 26-16. A 1 in a CPU/DMA interrupt enable register
bit enables the corresponding interrupt. A 0 disables the corresponding
interrupt. At reset, 0 is written to this register. Table 4-3 defines the register
bits, the bit names, and the bit functions.


 31-27   26     25     24     23     22     21     20     19     18     17     16
|  xx  |EDINT |ETINT1|ETINT0|ERINT1|EXINT1|ERINT0|EXINT0|EINT3 |EINT2 |EINT1 |EINT0 |
|      |(DMA) |(DMA) |(DMA) |(DMA) |(DMA) |(DMA) |(DMA) |(DMA) |(DMA) |(DMA) |(DMA) |
         R/W    R/W    R/W    R/W    R/W    R/W    R/W    R/W    R/W    R/W    R/W


 15-11   10      9      8      7      6      5      4      3      2      1      0
|  xx  |EDINT |ETINT1|ETINT0|ERINT1|EXINT1|ERINT0|EXINT0|EINT3 |EINT2 |EINT1 |EINT0 |
|      |(CPU) |(CPU) |(CPU) |(CPU) |(CPU) |(CPU) |(CPU) |(CPU) |(CPU) |(CPU) |(CPU) |
         R/W    R/W    R/W    R/W    R/W    R/W    R/W    R/W    R/W    R/W    R/W

NOTE: xx = reserved bit, read as 0. 
R = read, W = write. 


Figure 4-4. CPU/DMA Interrupt Enable Register (IE) 
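

Table 4-3. IE Register Bits Summary


BIT    | NAME     | FUNCTION
0      | EINT0    | Enable external interrupt 0 (CPU)
1      | EINT1    | Enable external interrupt 1 (CPU)
2      | EINT2    | Enable external interrupt 2 (CPU)
3      | EINT3    | Enable external interrupt 3 (CPU)
4      | EXINT0   | Enable serial port 0 transmit interrupt (CPU)
5      | ERINT0   | Enable serial port 0 receive interrupt (CPU)
6      | EXINT1   | Enable serial port 1 transmit interrupt (CPU)
7      | ERINT1   | Enable serial port 1 receive interrupt (CPU)
8      | ETINT0   | Enable timer 0 interrupt (CPU)
9      | ETINT1   | Enable timer 1 interrupt (CPU)
10     | EDINT    | Enable DMA controller interrupt (CPU)
11-15  | Reserved | Value undefined
16     | EINT0    | Enable external interrupt 0 (DMA)
17     | EINT1    | Enable external interrupt 1 (DMA)
18     | EINT2    | Enable external interrupt 2 (DMA)
19     | EINT3    | Enable external interrupt 3 (DMA)
20     | EXINT0   | Enable serial port 0 transmit interrupt (DMA)
21     | ERINT0   | Enable serial port 0 receive interrupt (DMA)
22     | EXINT1   | Enable serial port 1 transmit interrupt (DMA)
23     | ERINT1   | Enable serial port 1 receive interrupt (DMA)
24     | ETINT0   | Enable timer 0 interrupt (DMA)
25     | ETINT1   | Enable timer 1 interrupt (DMA)
26     | EDINT    | Enable DMA controller interrupt (DMA)
27-31  | Reserved | Value undefined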
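
A value for the IE register can be assembled from the bit positions given in Table 4-3 and Figure 4-4, as in the following sketch. The helper name `ie_value` and the dictionary are illustrative assumptions; the DMA enable bits are taken to sit 16 positions above their CPU counterparts, as the 10-0 and 26-16 ranges in the text suggest.

```python
CPU_BITS = {
    "EINT0": 0, "EINT1": 1, "EINT2": 2, "EINT3": 3,
    "EXINT0": 4, "ERINT0": 5, "EXINT1": 6, "ERINT1": 7,
    "ETINT0": 8, "ETINT1": 9, "EDINT": 10,
}
DMA_SHIFT = 16   # DMA enables occupy bits 26-16

def ie_value(cpu=(), dma=()) -> int:
    """Build an IE word enabling the named interrupts for the CPU and the DMA."""
    value = 0
    for name in cpu:
        value |= 1 << CPU_BITS[name]
    for name in dma:
        value |= 1 << (CPU_BITS[name] + DMA_SHIFT)
    return value

# Enable external interrupt 0 and timer 0 for the CPU only.
print(hex(ie_value(cpu=["EINT0", "ETINT0"])))
```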


4.1.9 CPU Interrupt Flag Register (IF) 


The 32-bit CPU interrupt flag register (IF) is shown in Figure 4-5. A 1 in a
CPU interrupt flag register bit indicates that the corresponding interrupt is set.
The IF bits are set to 1 when an interrupt occurs. They may also be set to 1
through software to cause an interrupt. A 0 indicates that the corresponding
interrupt is not set. If a 0 is written to an interrupt flag register bit, the
corresponding interrupt is cleared. At reset, 0 is written to this register. Table 4-4
lists the bit fields, bit field names, and bit field functions of the CPU interrupt
flag register.


 31-11   10      9      8      7      6      5      4      3      2      1      0
|  xx  | DINT |TINT1 |TINT0 |RINT1 |XINT1 |RINT0 |XINT0 | INT3 | INT2 | INT1 | INT0 |
         R/W    R/W    R/W    R/W    R/W    R/W    R/W    R/W    R/W    R/W    R/W


NOTE: xx = reserved bit, read as 0. 
R = read, W = write. 


Figure 4-5. CPU Interrupt Flag Register (IF) 


Table 4-4. IF Register Bits Summary 


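BIT    | NAME     | FUNCTION
0      | INT0     | External interrupt 0 flag
1      | INT1     | External interrupt 1 flag
2      | INT2     | External interrupt 2 flag
3      | INT3     | External interrupt 3 flag
4      | XINT0    | Serial port 0 transmit interrupt flag
5      | RINT0    | Serial port 0 receive interrupt flag
6      | XINT1    | Serial port 1 transmit interrupt flag
7      | RINT1    | Serial port 1 receive interrupt flag
8      | TINT0    | Timer 0 interrupt flag
9      | TINT1    | Timer 1 interrupt flag
10     | DINT     | DMA controller interrupt flag
11-31  | Reserved | Read as 0 (see Figure 4-5)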

4.1.10 I/O Flags Register (IOF) 


The I/O flags register (IOF) controls the function of the dedicated external
pins, XF0 and XF1. These pins may be configured for input or output (see
Table 4-5). They may also be read from and written to. At reset, 0 is written
to this register. The bit fields, bit field names, and bit field functions are shown
in Table 4-5.


  31-8     7       6       5       4       3       2       1       0
|  xx   | INXF1 |OUTXF1 |I/OXF1 |  xx   | INXF0 |OUTXF0 |I/OXF0 |  xx   |
           R      R/W     R/W              R      R/W     R/W


NOTE: xx = reserved bit, read as O. 
R = read, W = write. 


Figure 4-6. I/O Flag Register (IOF) 


Table 4-5. IOF Register Bits Summary


BIT    | NAME     | FUNCTION
0      | Reserved | Read as 0.
1      | I/OXF0   | If I/OXF0 = 0, XF0 is configured as a general-purpose input pin.
       |          | If I/OXF0 = 1, XF0 is configured as a general-purpose output pin.
2      | OUTXF0   | Data output on XF0.
3      | INXF0    | Data input on XF0. A write has no effect.
4      | Reserved | Read as 0.
5      | I/OXF1   | If I/OXF1 = 0, XF1 is configured as a general-purpose input pin.
       |          | If I/OXF1 = 1, XF1 is configured as a general-purpose output pin.
6      | OUTXF1   | Data output on XF1.
7      | INXF1    | Data input on XF1. A write has no effect.
8-31   | Reserved | Read as 0.
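
The following sketch shows how an IOF value might be formed to drive XF0 as an output and how the XF1 input level might be read back. The bit positions follow Figure 4-6 (I/OXF0 = bit 1, OUTXF0 = bit 2, INXF0 = bit 3, and the XF1 fields four bits higher); the function names are hypothetical, and writing reserved and input bits as zero is a choice made for the example.

```python
IO_XF0, OUT_XF0, IN_XF0 = 1 << 1, 1 << 2, 1 << 3
IO_XF1, OUT_XF1, IN_XF1 = 1 << 5, 1 << 6, 1 << 7
WRITABLE = IO_XF0 | OUT_XF0 | IO_XF1 | OUT_XF1   # INXFx are inputs, the rest reserved

def drive_xf0(iof: int, level: int) -> int:
    """Return an IOF value that configures XF0 as an output driving `level`."""
    iof |= IO_XF0                       # I/OXF0 = 1 selects output
    if level:
        iof |= OUT_XF0
    else:
        iof &= ~OUT_XF0
    return iof & WRITABLE               # reserved bits are written as zero

def read_xf1(iof: int) -> int:
    """Return the level seen on XF1 (the INXF1 bit)."""
    return (iof >> 7) & 1

print(hex(drive_xf0(0, 1)), read_xf1(0x80))
```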


4.1.11 Repeat Counter (RC) and Block Repeat Registers (RS, RE) 


The repeat counter (RC) is a 32-bit register used to specify the number of 
times a block of code is to be repeated when performing a block repeat. 


The repeat start address register (RS) is a 32-bit register containing the start- 
ing address of the block of program memory to be repeated when operating 
in the repeat mode. 


The 32-bit repeat end address register (RE) contains the ending address of the 
block of program memory to be repeated when operating in the repeat mode. 


4.1.12 Program Counter (PC) 


The program counter (PC) is a 32-bit register containing the address of the
next instruction to be fetched. While the program counter is not part of the
CPU register file, it is a register that can be modified via instructions that
modify the program flow.


4.1.13 Reserved Bits and Compatibility 


In order to retain compatibility with future members of the TMS320C3X family 
of microprocessors, reserved bits that are read as zero must be written as zero. 
Reserved bits that have an undefined value must not have their current value 
modified. In other cases, the user should maintain the reserved bits as speci- 
fied. 


4.2 Memory


The total memory space of the TMS320C30 is 16M (million) 32-bit words. 
Program, data, and I/O space are contained within this, allowing tables, co- 
efficients, program code, or data to be stored in either RAM or ROM. In this 
way, memory usage can be maximized and memory space allocated as desired. 


RAM blocks 0 and 1 are each 1K x 32 bits. The ROM block is 4K x 32 bits.
Each RAM and ROM block is capable of supporting two accesses in a single
cycle. The separate program buses, data buses, and DMA buses allow for
parallel program fetches, data reads/writes, and DMA operations. This is
covered in detail in Section 10.3.


4.2.1 Memory Maps 


The memory map is dependent upon whether the processor is running in the
microprocessor mode (MC/MP = 0) or the microcomputer mode (MC/MP =
1). The memory maps for these modes are very similar (see Figure 4-7).
Locations 800000h through 801FFFh are mapped to the expansion bus. When
this region is accessed, MSTRB is active. Locations 802000h through
803FFFh are reserved. Locations 804000h through 805FFFh are mapped to
the expansion bus. When this region is accessed, IOSTRB is active. Locations
806000h through 807FFFh are reserved. All of the memory-mapped peripheral
registers are in locations 808000h through 8097FFh. In both modes,
RAM block 0 is located at addresses 809800h through 809BFFh, and RAM
block 1 is located at addresses 809C00h through 809FFFh. Memory locations
80A000h through 0FFFFFFh are accessed over the external memory port
(STRB active).


In microprocessor mode, the 4K on-chip ROM is not mapped into the
TMS320C30 memory map. Locations 0h through 3Fh consist of interrupt
vector, trap vector, and reserved locations, all of which are accessed over the
external memory port (STRB active). Locations 40h through 7FFFFFh are also
accessed over the external memory port.


In microcomputer mode, the 4K on-chip ROM is mapped into locations 0h
through 0FFFh. There are 192 locations (0h through BFh) within this block
for interrupt vectors, trap vectors, and a reserved space. Locations 1000h
through 7FFFFFh are accessed over the external memory port (STRB active).


Reserved portions of the TMS320C30 memory space and reserved peripheral
bus addresses should not be read or written by the user. Doing so may
cause the TMS320C30 to halt operation and require a system reset to restart.
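
The address ranges listed in this section can be collected into a small lookup, sketched below. The function name `region` and the region labels are illustrative assumptions; the ranges themselves are the ones given in the text, and the peripheral range is not subdivided into its reserved portions here.

```python
UPPER_MAP = [
    (0x800000, 0x801FFF, "expansion bus (MSTRB active)"),
    (0x802000, 0x803FFF, "reserved"),
    (0x804000, 0x805FFF, "expansion bus (IOSTRB active)"),
    (0x806000, 0x807FFF, "reserved"),
    (0x808000, 0x8097FF, "peripheral bus registers"),
    (0x809800, 0x809BFF, "RAM block 0"),
    (0x809C00, 0x809FFF, "RAM block 1"),
    (0x80A000, 0xFFFFFF, "external memory port (STRB active)"),
]

def region(addr: int, microcomputer: bool) -> str:
    """Name the memory region that a 24-bit address falls in."""
    if not 0 <= addr <= 0xFFFFFF:
        raise ValueError("address must fit in 24 bits")
    if addr <= 0x7FFFFF:
        # Only the bottom 4K differs between the two modes.
        if microcomputer and addr <= 0x0FFF:
            return "on-chip ROM"
        return "external memory port (STRB active)"
    for low, high, name in UPPER_MAP:
        if low <= addr <= high:
            return name
    raise AssertionError("unreachable: the table covers 800000h-0FFFFFFh")

print(region(0x000010, microcomputer=True))
print(region(0x809800, microcomputer=False))
```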


MICROPROCESSOR MODE

0h      - BFh         INTERRUPT LOCATIONS AND RESERVED (192)
                      (EXTERNAL, STRB ACTIVE)
C0h     - 7FFFFFh     EXTERNAL, STRB ACTIVE
800000h - 801FFFh     EXPANSION BUS, MSTRB ACTIVE (8K)
802000h - 803FFFh     RESERVED (8K)
804000h - 805FFFh     EXPANSION BUS, IOSTRB ACTIVE (8K)
806000h - 807FFFh     RESERVED (8K)
808000h - 8097FFh     PERIPHERAL BUS MEMORY-MAPPED REGISTERS (INTERNAL) (6K)
809800h - 809BFFh     RAM BLOCK 0 (1K) (INTERNAL)
809C00h - 809FFFh     RAM BLOCK 1 (1K) (INTERNAL)
80A000h - 0FFFFFFh    EXTERNAL, STRB ACTIVE


MICROCOMPUTER MODE

0h      - BFh         INTERRUPT LOCATIONS AND RESERVED (192)
C0h     - 0FFFh       ROM (INTERNAL)
1000h   - 7FFFFFh     EXTERNAL, STRB ACTIVE
800000h - 801FFFh     EXPANSION BUS, MSTRB ACTIVE (8K)
802000h - 803FFFh     RESERVED (8K)
804000h - 805FFFh     EXPANSION BUS, IOSTRB ACTIVE (8K)
806000h - 807FFFh     RESERVED (8K)
808000h - 8097FFh     PERIPHERAL BUS MEMORY-MAPPED REGISTERS (INTERNAL) (6K)
809800h - 809BFFh     RAM BLOCK 0 (1K) (INTERNAL)
809C00h - 809FFFh     RAM BLOCK 1 (1K) (INTERNAL)
80A000h - 0FFFFFFh    EXTERNAL, STRB ACTIVE


Figure 4-7. Memory Maps 


4.2.2 Peripheral Bus Map


The memory-mapped peripheral registers are located starting at address
808000h. The peripheral bus memory map is shown in Figure 4-8. Each
peripheral occupies a 16-word region of the memory map. Locations 808010h
through 80801Fh and locations 808070h through 8097FFh are reserved.


808000h - 80800Fh    DMA CONTROLLER REGISTERS (16)
808010h - 80801Fh    RESERVED (16)
808020h - 80802Fh    TIMER 0 REGISTERS (16)
808030h - 80803Fh    TIMER 1 REGISTERS (16)
808040h - 80804Fh    SERIAL PORT 0 REGISTERS (16)
808050h - 80805Fh    SERIAL PORT 1 REGISTERS (16)
808060h - 80806Fh    PRIMARY AND EXPANSION PORT REGISTERS (16)
808070h - 8097FFh    RESERVED


Figure 4-8. Peripheral Bus Memory Map 


4.2.3 Reset/Interrupt/Trap Vector Map 


The addresses for the reset, interrupt, and trap vectors are 0h through 3Fh, as
shown in Figure 4-9. The vectors stored in these locations are the addresses
of the start of the respective reset, interrupt, and trap routines. For example,
at reset, the contents of memory location 0h (the reset vector) are loaded into
the PC and execution begins from that address.


Traps 28-31 are reserved and should not be used by the user. 


00h           reset vector
01h - 0Bh     interrupt vectors
0Ch - 1Fh     RESERVED
20h - 3Bh     trap vectors
3Ch - 3Fh     trap vectors 28-31 (reserved)


Figure 4-9. Reset, Interrupt, and Trap Vector Locations 


4.3 Instruction Cache


A 64 x 32-bit instruction cache allows for maximum system performance with 
minimal system cost. The instruction cache stores sections of code that can 
be fetched when repeatedly accessing time-critical code. This greatly reduces 
the number of off-chip accesses necessary and allows for code to be stored 
off-chip in slower, lower-cost memories. The external buses are also freed 
from program fetches, so they can be used by the DMA or other system ele- 
ments. 


The cache can operate in a completely automatic fashion without the need for
user intervention. A form of the LRU (least-recently-used) cache update
algorithm is used (see Section 4.3.2).


4.3.1 Cache Architecture 


The instruction cache (see Figure 4-10) contains 64 32-bit words of RAM.
The cache is divided into two 32-word segments. Associated with each segment
is a 19-bit segment start address (SSA) register. For each word in the
cache, there is a corresponding single-bit Present (P) flag.


SEGMENT START
ADDRESS REGISTERS     P FLAGS    SEGMENT WORDS (32 BITS EACH)

SSA REGISTER 0        |  0 |     SEGMENT WORD 0
                      |  1 |     SEGMENT WORD 1
                      |  . |          .               SEGMENT 0
                      | 30 |     SEGMENT WORD 30
                      | 31 |     SEGMENT WORD 31

SSA REGISTER 1        |  0 |     SEGMENT WORD 0
                      |  1 |     SEGMENT WORD 1
                      |  . |          .               SEGMENT 1
                      | 30 |     SEGMENT WORD 30
                      | 31 |     SEGMENT WORD 31

LRU STACK
    top:     MOST-RECENTLY-USED SEGMENT NUMBER
    bottom:  LEAST-RECENTLY-USED SEGMENT NUMBER


Figure 4-10. Instruction Cache Architecture 


When the CPU requests an instruction word from external memory, a check 
is made to determine if the word is already contained in the instruction cache. 
The partitioning of an instruction address as used by the cache control algo- 
rithm is shown in Figure 4-11. The 19 most-significant bits of the instruction 
address are used to select the segment and the 5 least-significant bits define 
the address of the instruction word within the pertinent segment. The 19 
MSBs of the instruction address are compared with the two segment start
address (SSA) registers. If a match is found, a check is made of the relevant
P flag. The P flag indicates whether or not the word within a particular
segment is already present in cache memory.


 23                              5 4                         0
|     segment start address       |    instruction word       |
|             (SSA)               |  address within segment   |


Figure 4-11. Address Partitioning for Cache Control Algorithm 
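
The partitioning shown in Figure 4-11 amounts to a shift and a mask, as the short sketch below illustrates. The function name `split_address` is hypothetical.

```python
def split_address(addr: int):
    """Split a 24-bit instruction address into (segment start address, word index)."""
    addr &= 0xFFFFFF
    ssa = addr >> 5        # 19 most-significant bits select the segment
    word = addr & 0x1F     # 5 least-significant bits select the word in the segment
    return ssa, word

# Address 47h lies in the segment starting at 40h, word 7.
print(split_address(0x000047))
```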


If there is no match, one of the segments must be replaced by the new data. 
The segment replaced in this circumstance is determined by the LRU (least- 
recently-used) algorithm. The LRU stack (see Figure 4-10) is maintained for 
this purpose. 


The LRU stack tracks which of the two segments qualifies as the least recently
used after each access to the cache; the stack therefore contains either
0,1 or 1,0. Each time a segment is accessed, its segment number is removed
from the LRU stack and pushed onto the top of the LRU stack. Therefore, the
number at the top of the stack is the most-recently-used segment number and
the number at the bottom of the stack is the least-recently-used segment
number.


At system reset, the LRU stack is initialized with 0 at the top, 1 at the bottom,
and all P flags in the instruction cache are cleared. If both SSA registers are
equal (due to system reset conditions) and a cache hit occurs, the instruction
word is fetched from the most recently used segment.


When a replacement is necessary, the least-recently-used segment is selected
for replacement. Also, the 32 P flags for the segment to be replaced are set
to 0, and the segment’s SSA register is replaced with the 19 MSBs of the
instruction address.


4.3.2 Cache Algorithm 


When the TMS320C30 requests an instruction word from external memory, 
two possible actions occur: a cache hit or a cache miss. These are described 
in the following list: 


● Cache Hit. The requested instruction is contained within the cache
  and the following actions occur:

  1) The instruction word is read from the cache.

  2) The segment number of the segment within which the word is
     contained is removed from the LRU stack and pushed to the top
     of the LRU stack, thus moving the other segment number to the
     bottom of the stack.


● Cache Miss. The instruction is not contained in the cache. Types of
  cache miss are:

  1) Word Miss. The segment start address register matches the
     instruction address, but the relevant P flag is not set. The
     following actions occur in parallel:

     - The instruction word is read from memory and copied into
       the cache.

     - The segment number of the segment within which the word
       is contained is removed from the LRU stack and pushed to
       the top of the LRU stack, thus moving the other segment
       number to the bottom of the stack.

     - The relevant P flag is set.

  2) Segment Miss. Neither of the segment start addresses matches the
     instruction address. The following actions occur in parallel:

     - The least-recently-used segment is selected for replacement.
       The P flags for all 32 words are cleared.

     - The SSA register for the selected segment is loaded with the
       19 MSBs of the address of the requested instruction word.

     - The instruction word is fetched and copied into the cache.
       It goes into the appropriate word of the least-recently-used
       segment. The P flag for that word is set to 1.

     - The segment number of the segment containing the instruction
       word is removed from the LRU stack and pushed to the
       top of the LRU stack, thus moving the other segment number
       to the bottom of the stack.


Only instructions may be fetched from the program cache. All reads and writes 
of data in memory bypass the cache. Program fetches from internal memory 
do not modify the cache and will not generate cache hits or misses. The pro- 
gram cache is a single-access memory block. Dummy program fetches (i.e., 
following a branch) are treated by the cache as valid program fetches and can 
generate cache misses and cache updates. 


Care should be taken when using self-modifying code. If an instruction re- 
sides in cache and the corresponding location in primary memory is modified, 
the copy of the instruction in cache is not modified. 


More efficient use of the cache can be made by aligning program code on
32-word address boundaries. This can be done using the ALIGN directive when
coding assembly language.
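
The hit, word miss, and segment miss cases described in this section can be followed in a small software model. The sketch below is an illustration under stated assumptions, not a description of the hardware: the class name `InstructionCache`, the dictionary standing in for external memory, and the SSA register value of 0 after reset are all hypothetical, and the CE, CF, and CC control bits are not modeled.

```python
class InstructionCache:
    def __init__(self):
        self.ssa = [0, 0]                                # SSA registers (reset value assumed)
        self.present = [[False] * 32 for _ in range(2)]  # P flags, cleared at reset
        self.words = [[None] * 32 for _ in range(2)]
        self.lru = [0, 1]                                # index 0 = top = most recently used

    def _touch(self, seg):
        """Move a segment number to the top of the LRU stack."""
        self.lru.remove(seg)
        self.lru.insert(0, seg)

    def fetch(self, addr, memory):
        """Fetch one instruction word; return (word, outcome)."""
        ssa, word = (addr & 0xFFFFFF) >> 5, addr & 0x1F
        for seg in list(self.lru):                       # most recently used first
            if self.ssa[seg] == ssa and self.present[seg][word]:
                self._touch(seg)
                return self.words[seg][word], "hit"
        for seg in list(self.lru):
            if self.ssa[seg] == ssa:                     # word miss
                self.words[seg][word] = memory[addr]
                self.present[seg][word] = True
                self._touch(seg)
                return memory[addr], "word miss"
        seg = self.lru[-1]                               # segment miss: replace LRU segment
        self.present[seg] = [False] * 32
        self.ssa[seg] = ssa
        self.words[seg][word] = memory[addr]
        self.present[seg][word] = True
        self._touch(seg)
        return memory[addr], "segment miss"

memory = {a: a * 10 for a in range(0x100, 0x200)}        # stand-in for external memory
cache = InstructionCache()
for a in (0x100, 0x100, 0x101, 0x120, 0x140, 0x100):
    print(hex(a), cache.fetch(a, memory)[1])
```

In this run the three 32-word blocks at 100h, 120h, and 140h compete for two segments, so the final fetch of 100h misses again. Keeping a time-critical loop within one or two aligned 32-word blocks avoids that kind of replacement.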


4.3.3 Cache Control Bits 


Three cache control bits are located in the CPU status register: the cache clear 
bit (CC), cache enable bit (CE), and the cache freeze bit (CF). 


Cache Clear Bit (CC). Writing a 1 to the cache clear bit (CC) invalidates
all entries in the cache. All P flags in the cache are cleared. The CC bit is
always cleared after the cache is cleared. It is therefore always read as a 0. At
reset the cache is cleared and 0 is written to this bit.


Cache Enable Bit (CE). Writing a 1 to this bit enables the cache. When
enabled, the cache is used according to the previously described cache
algorithm. Writing a 0 to the cache enable bit disables the cache; no updates or
modification of the cache can be performed. Specifically, no SSA register
updates are performed, no P flags are modified (unless CC = 1), and the LRU
stack is not modified. Writing a 1 to CC when the cache is disabled will clear
the cache, and thus the P flags. No fetches are made from the cache when the
cache is disabled. At reset, 0 is written to this bit.


Cache Freeze Bit (CF). When CF = 1, the cache is frozen. If, in addition,
the cache is enabled, fetches from the cache are allowed, but no modification
of the state of the cache is performed. Specifically, no SSA register updates
are performed, no P flags are modified (unless CC = 1), and the LRU stack is
not modified. This function can be used to keep frequently used code resident
in the cache. Writing a 1 to CC when the cache is frozen will clear the cache,
and thus the P flags. At reset, 0 is written to this bit.


Table 4-6 defines the effect of the CE and CF bits used in combination.


Table 4-6. Combined Effect of the CE and CF Bits


CE | CF | EFFECT
0  | 0  | Cache not enabled
0  | 1  | Cache not enabled
1  | 0  | Cache enabled and not frozen
1  | 1  | Cache enabled and frozen


Section 5 


Data Formats and Floating-Point Operation 


Data is organized in the TMS320C30 architecture to provide three funda- 
mental data types: integer, unsigned-integer, and floating-point. Note that the 
terms, integer and signed-integer, are considered to be equivalent. The 
TMS320C30 supports short and single-precision formats for signed and un- 
signed integers. It also supports short, single-precision and extended- 
precision formats for floating-point data. 


Floating-point operations provide convenient and trouble-free computations 
while maintaining accuracy and precision. The TMS320C30 implementation 
of floating-point arithmetic allows for floating-point operations at integer 
speeds. The floating-point capability can prevent problems with overflow, 
operand alignment, and other burdensome tasks common in integer oper- 
ations. 


This section discusses in detail the data formats and floating-point operations 
supported on the TMS320C30. Major topics in this section are as follows: 


● Integer Formats (Section 5.1)
● Unsigned-Integer Formats (Section 5.2)
● Floating-Point Formats (Section 5.3)
● Floating-Point Multiplication (Section 5.4)
● Floating-Point Addition and Subtraction (Section 5.5)
● Normalization (Section 5.6)
● Rounding (Section 5.7)
● Floating-Point to Integer Conversions (Section 5.8)
● Integer to Floating-Point Conversions (Section 5.9)


5.1 Integer Formats


The TMS320C30 supports two integer formats: a 16-bit short integer format 
and a 32-bit single-precision integer format. When extended-precision regis- 
ters are used as integer operands only bits 31-0 are used; bits 39-32 remain 
unchanged and unused. 


5.1.1 Short Integer Format 


The short integer format is a 16-bit two’s-complement integer format, used for
immediate integer operands. For those instructions that assume integer
operands, this format is sign-extended to 32 bits (see Figure 5-1). The range of
an integer si, represented in the short integer format, is
$-2^{15} \le si \le 2^{15} - 1$.

In Figure 5-1, s = sign bit.


 15 14                          0
| s |                            |
Short Integer Format


 31               16 15 14                          0
| s s s s ... s s s | s |                            |
Sign Extension of a Short Integer


Figure 5-1. Short Integer Format and Sign Extension of Short Integer
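
Sign extension of a short integer operand can be expressed as a mask and a conditional fill, as in this sketch. The function name `sign_extend16` is hypothetical; the result is returned as a 32-bit pattern.

```python
def sign_extend16(value: int) -> int:
    """Sign-extend a 16-bit two's-complement operand to a 32-bit pattern."""
    value &= 0xFFFF
    if value & 0x8000:               # sign bit set: fill bits 31-16 with ones
        return value | 0xFFFF0000
    return value                     # sign bit clear: bits 31-16 are zero

print(hex(sign_extend16(0x8000)), hex(sign_extend16(0x7FFF)))
```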


5.1.2 Single-Precision Integer Format 


In the single-precision integer format, the integer is represented in
two’s-complement notation. The range of an integer sp, represented in the
single-precision integer format, is $-2^{31} \le sp \le 2^{31} - 1$. Figure 5-2
shows the single-precision integer format.


 31                                                 0
|                                                    |


Figure 5-2. Single-Precision Integer Format 


5.2 Unsigned-Integer Formats


Two unsigned-integer formats are supported on the TMS320C30: a 16-bit
short format and a 32-bit single-precision format. In extended-precision
registers, the unsigned-integer operands use only bits 31-0; bits 39-32 remain
unchanged.


5.2.1 Short Unsigned-Integer Format 


Figure 5-3 shows the 16-bit short unsigned-integer format, used for immediate
unsigned-integer operands. For those instructions that assume
unsigned-integer operands, this format is zero-filled to 32 bits. In Figure 5-3
below, X = MSB (1 or 0).


 15 14                          0
| X |                            |
Short Unsigned-Integer Format


 31               16 15 14                          0
| 0 0 0 0 ... 0 0 0 | X |                            |
Zero Fill of a Short Unsigned Integer


Figure 5-3. Short Unsigned-Integer Format and Zero Fill


5.2.2 Single-Precision Unsigned-Integer Format 


In the single-precision unsigned-integer format, the number is represented as 
a 32-bit value, as shown in Figure 5-4. 


31 0 


Figure 5-4. Single-Precision Unsigned-Integer Format 


5.3 Floating-Point Formats


All TMS320C30 floating-point formats consist of three fields: an exponent
field (e), a single sign-bit field (s), and a fraction field (f). These are stored
as shown in Figure 5-5. The exponent field is a two’s-complement number.
The sign field and fraction field may be considered as one unit and referred to
as the mantissa field (man). The mantissa is used to represent a normalized
two’s-complement number. In a normalized representation, a most-significant
nonsign bit is implied, thus providing an additional bit of precision. The
value of a floating-point number x as a function of the fields e, s, and f is given
as

\[
x = \begin{cases}
01.f \times 2^{e} & \text{if } s = 0 \\
10.f \times 2^{e} & \text{if } s = 1 \\
0 & \text{if } e = \text{most negative two’s-complement value for the specified exponent field width}
\end{cases}
\]


|     e     | s |            f            |
            |<------ man (mantissa) ----->|


Figure 5-5. Generic Floating-Point Format 


Three floating-point formats are supported on the TMS320C30. The first is a 
short floating-point format for immediate floating-point operands, consisting 
of a 4-bit exponent, 1 sign bit, and an 11-bit fraction. The second is a sin- 
gle-precision format consisting of an 8-bit exponent, 1 sign bit, and a 23-bit 
fraction. The third is an extended-precision format consisting of an 8-bit ex- 
ponent, 1 sign bit, and a 31-bit fraction. 
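
The value formula given above for the generic format can be checked with a short decoder. The sketch below is only an illustration: the function name `decode` and its parameters are hypothetical, and the bit pattern is assumed to be packed with the exponent in the most-significant bits, followed by the sign bit and then the fraction. The two's-complement mantissa 01.f is read as 1 + f and 10.f as -2 + f, where f is the fraction scaled to the range 0 to 1.

```python
def decode(bits: int, exp_bits: int, frac_bits: int) -> float:
    """Decode an (e, s, f) bit pattern into a Python float."""
    f = bits & ((1 << frac_bits) - 1)
    s = (bits >> frac_bits) & 1
    e = (bits >> (frac_bits + 1)) & ((1 << exp_bits) - 1)
    if e >= 1 << (exp_bits - 1):          # exponent is two's complement
        e -= 1 << exp_bits
    if e == -(1 << (exp_bits - 1)):       # most negative exponent represents zero
        return 0.0
    mantissa = (-2.0 if s else 1.0) + f / (1 << frac_bits)
    return mantissa * 2.0 ** e

# Short format: 4-bit exponent, 11-bit fraction.
print(decode(0x77FF, 4, 11))   # e = 7, s = 0, f = all ones
print(decode(0x7800, 4, 11))   # e = 7, s = 1, f = 0
```

The same function handles the single-precision format with `decode(bits, 8, 23)` and the extended-precision format with `decode(bits, 8, 31)`.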


5.3.1 Short Floating-Point Format 


In the short floating-point format, floating-point numbers are represented by
a two’s-complement 4-bit exponent field (e) and a two’s-complement 12-bit
mantissa field (man) with an implied most-significant nonsign bit.


 15       12 11 10                     0
|     e     | s |          f            |
            |<--------- man ----------->|


Figure 5-6. Short Floating-Point Format 


Data Formats - Floating-Point Formats 


Operations are performed with an implied binary point between bits 11 and 
10. When the implied most-significant nonsign bit is made explicit, it is lo- 
cated to the immediate left of the binary point. The floating-point two’s- 
complement number x in the short floating-point format is given by 


\[
x = \begin{cases}
01.f \times 2^{e} & \text{if } s = 0 \\
10.f \times 2^{e} & \text{if } s = 1 \\
0 & \text{if } e = -8,\ s = 0,\ f = 0
\end{cases}
\]


The following reserved values must be used to represent zero in the short
floating-point format:


e = -8
s = 0
f = 0


The following examples illustrate the range and precision of the short
floating-point format:

Most Positive:   $x = (2 - 2^{-11}) \times 2^{7} = 2.5594 \times 10^{2}$
Least Positive:  $x = 1 \times 2^{-7} = 7.8125 \times 10^{-3}$
Least Negative:  $x = (-1 - 2^{-11}) \times 2^{-7} = -7.8163 \times 10^{-3}$
Most Negative:   $x = -2 \times 2^{7} = -2.5600 \times 10^{2}$

5.3.2 Single-Precision Floating-Point Format 


In the single-precision format, the floating-point number is represented by an 
8-bit exponent field (e) and a two’s-complement 24-bit mantissa field (man) 
with an implied most-significant nonsign bit. 


Operations are performed with an implied binary point between bits 23 and
22. When the implied most-significant nonsign bit is made explicit, it is
located to the immediate left of the binary point. The floating-point number x
is given by

\[
x = \begin{cases}
01.f \times 2^{e} & \text{if } s = 0 \\
10.f \times 2^{e} & \text{if } s = 1 \\
0 & \text{if } e = -128,\ s = 0,\ f = 0
\end{cases}
\]


 31       24 23 22                     0
|     e     | s |          f            |
            |<--------- man ----------->|


Figure 5-7. Single-Precision Floating-Point Format


The following reserved values must be used to represent zero in the
single-precision floating-point format:


e = -128
s = 0
f = 0


The following examples illustrate the range and precision of the
single-precision floating-point format:

Most Positive:   $x = (2 - 2^{-23}) \times 2^{127} = 3.4028234 \times 10^{38}$
Least Positive:  $x = 1 \times 2^{-127} = 5.8774717 \times 10^{-39}$
Least Negative:  $x = (-1 - 2^{-23}) \times 2^{-127} = -5.8774724 \times 10^{-39}$
Most Negative:   $x = -2 \times 2^{127} = -3.4028236 \times 10^{38}$

5.3.3 Extended-Precision Floating-Point Format 


In the extended-precision format, the floating-point number is represented by 
an 8-bit exponent field (e) and a 32-bit mantissa field (man) with an implied 
most-significant nonsign bit. 


Operations are performed with an implied binary point between bits 31 and 
30. When the implied most-significant nonsign bit is made explicit, it is lo- 
cated to the immediate left of the binary point. The floating-point number x 
is given by 


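\[
x = \begin{cases}
01.f \times 2^{e} & \text{if } s = 0 \\
10.f \times 2^{e} & \text{if } s = 1 \\
0 & \text{if } e = -128,\ s = 0,\ f = 0
\end{cases}
\]


 39       32 31 30                     0
|     e     | s |          f            |
            |<--------- man ----------->|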


Figure 5-8. Extended-Precision Floating-Point Format 


The following reserved values must be used to represent zero in the
extended-precision floating-point format:


e = -128
s = 0
f = 0


The following examples illustrate the range and precision of the
extended-precision floating-point format:

Most Positive:   $x = (2 - 2^{-31}) \times 2^{127} = 3.4028236683 \times 10^{38}$
Least Positive:  $x = 1 \times 2^{-127} = 5.8774717541 \times 10^{-39}$
Least Negative:  $x = (-1 - 2^{-31}) \times 2^{-127} = -5.8774717569 \times 10^{-39}$
Most Negative:   $x = -2 \times 2^{127} = -3.4028236691 \times 10^{38}$


5.3.4 Conversion Between Floating-Point Formats


Floating-point operations assume several different formats for inputs and
outputs. These formats often require conversion from one floating-point
format to another (e.g., short floating-point format to extended-precision
floating-point format). Format conversions automatically occur in hardware, with
no overhead, as a part of the floating-point operations. The four conversions
are shown below with examples of the conversion. When a floating-point
format zero is converted to a greater-precision format, it is always converted
to a valid representation of zero in that format. In the figures below, S = sign
bit of the exponent.


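• Short floating-point format conversion to single-precision
  floating-point format.

  [Figure: the short floating-point format (bits 15-12 exponent, bit 11 sign,
  bits 10-0 fraction) maps to the single-precision floating-point format
  (bits 31-28 copies of S, bits 27-24 exponent, bit 23 sign, bits 22-12
  fraction, bits 11-0 zero).]

  In this conversion, the exponent field is sign-extended and the fraction
  field is filled with zeros.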


• Short floating-point format conversion to extended-precision
  floating-point format.

  [Figure: the short floating-point format (bits 15-12 exponent, bit 11 sign,
  bits 10-0 fraction) maps to the extended-precision floating-point format
  (bits 39-36 copies of S, bits 35-32 exponent, bit 31 sign, bits 30-20
  fraction, bits 19-0 zero).]

  In this conversion, the exponent field is sign-extended and the fraction
  field is filled with zeros.


• Single-precision floating-point format conversion to
  extended-precision floating-point format.

  [Figure: the single-precision floating-point format (bits 31-24 exponent,
  bit 23 sign, bits 22-0 fraction) maps to the extended-precision
  floating-point format (bits 39-32 exponent, bit 31 sign, bits 30-8
  fraction, bits 7-0 zero).]

  The fraction field is filled with zeros.


• Extended-precision floating-point format conversion to
  single-precision floating-point format.

  [Figure: the extended-precision floating-point format (bits 39-32
  exponent, bit 31 sign, bits 30-8 and 7-0 fraction) maps to the
  single-precision floating-point format (bits 31-24 exponent, bit 23 sign,
  bits 22-0 fraction); fraction bits 7-0 are discarded.]

  The fraction field is truncated.
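
The four conversions above amount to sign-extending the exponent and padding or truncating the fraction. The following sketch works on raw bit patterns held in Python integers. All function names are hypothetical. It also assumes that the short-format zero uses the most-negative 4-bit exponent (-8), by analogy with the reserved value -128 in the wider formats, so that zero maps to a valid zero as stated above.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return ((value & ((1 << bits) - 1)) ^ sign) - sign


def short_to_single(w):
    e = sign_extend(w >> 12, 4)        # bits 15-12: exponent
    s = (w >> 11) & 1                  # bit 11: sign
    f = w & 0x7FF                      # bits 10-0: fraction
    if e == -8:                        # assumed short-format zero
        return 0x80 << 24              # e = -128, s = 0, f = 0
    # sign-extended exponent, fraction padded with 12 zeros
    return ((e & 0xFF) << 24) | (s << 23) | (f << 12)


def single_to_extended(w):
    e, s, f = (w >> 24) & 0xFF, (w >> 23) & 1, w & 0x7FFFFF
    return (e << 32) | (s << 31) | (f << 8)      # fraction padded with 8 zeros


def short_to_extended(w):
    return single_to_extended(short_to_single(w))


def extended_to_single(w):
    e, s = (w >> 32) & 0xFF, (w >> 31) & 1
    f = (w >> 8) & 0x7FFFFF                      # low 8 fraction bits dropped
    return (e << 24) | (s << 23) | f


print(hex(short_to_single(0xFC00)))        # e = -1, s = 1, f = 0x400
print(hex(extended_to_single(0xFFC00000FF)))
```

Routing the short-to-extended case through the single-precision format keeps the zero handling in one place; the result is the same as sign-extending and zero-filling directly.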

5.4 Floating-Point Multiplication


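A floating-point number a can be written in floating-point format as shown
in the following formula, where a(man) is the mantissa and a(exp) is the
exponent.

$a = a(\mathrm{man}) \times 2^{a(\mathrm{exp})}$

The product of a and b is c, defined as

$c = a \times b = a(\mathrm{man}) \times b(\mathrm{man}) \times 2^{a(\mathrm{exp}) + b(\mathrm{exp})}$

so that

$c(\mathrm{man}) = a(\mathrm{man}) \times b(\mathrm{man})$
$c(\mathrm{exp}) = a(\mathrm{exp}) + b(\mathrm{exp})$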


When performing floating-point multiplication, source operands are always
assumed to be in the single-precision floating-point format. If a source
operand is in short floating-point format, it is extended to the
single-precision floating-point format. If a source operand is in
extended-precision floating-point format, it is truncated to single-precision
format. These conversions occur automatically in hardware with no overhead.
All results of floating-point multiplications are in the extended-precision
format. These multiplications occur in a single cycle.


A flowchart for floating-point multiplication is shown in Figure 5-9. In step
1, the 24-bit source operand mantissas are multiplied, producing a 50-bit
result c(man). (Note that input and output data are always represented as
normalized numbers.) In step 2, the exponents are added, yielding c(exp). Steps
3 through 6 check for special cases. Step 3 checks whether c(man) in
extended-precision format is equal to zero. If c(man) is zero, step 7 sets
c(exp) to -128, thus yielding the representation for zero.


Steps 4 and 5 normalize the result. If a right shift of one is necessary, then in 
step 8, c(man) is right-shifted one bit and one is added to c(exp). If a right 
shift of two is necessary, then in step 9, c(man) is right-shifted two bits and 
two is added to c(exp). Step 6 occurs when the result is normalized. 


In step 10, c(man) is set in the extended-precision floating-point format.
Steps 11 through 13 check for special cases of c(exp). If c(exp) has
overflowed (step 11), then in step 14 c is set to the most-positive
extended-precision format value if c(man) is positive, or to the
most-negative extended-precision format value if c(man) is negative.
If c(exp) has underflowed (step 12), then c is set to zero
(step 15); i.e., c(man) = 0 and c(exp) = -128.


[Flowchart, recovered steps:
 (1)  Multiply mantissas: c(man) = a(man) * b(man) (50-bit result).
 (2)  Add exponents: c(exp) = a(exp) + b(exp).
 Test for special cases of c(man):
 (3)  c(man) = 0.
 (4)  Right-shift 1 to normalize.
 (5)  Right-shift 2 to normalize.
 (6)  No shift to normalize.
 (7)  c(exp) = -128.
 (8)  c(man) >> 1, c(exp) + 1.
 (9)  c(man) >> 2, c(exp) + 2.
 (10) Dispose of extra bits; put c(man) in extended-precision
      floating-point format.
 Test for special cases of c(exp):
 (11) c(exp) overflow.
 (12) c(exp) underflow.
 (13) c(exp) in range.
 (14) If c(man) > 0, set c to most positive value.
      If c(man) < 0, set c to most negative value.
 (15) c(exp) = -128, c(man) = 0.
 (16) Set c to final result: c = a * b.]


Figure 5-9. Flowchart for Floating-Point Multiplication 


Floating-Point Operations - Multiplication 


The following examples illustrate how floating-point multiplication is per- 
formed on the TMS320C30. For these examples, the implied most-significant 
nonsign bit is made explicit. 


Example 5-1. Floating-Point Multiply (Both Mantissas = -2.0) 


Let

$a = -2.0 \times 2^{a(\mathrm{exp})} = 10.00000000000000000000000 \times 2^{a(\mathrm{exp})}$
$b = -2.0 \times 2^{b(\mathrm{exp})} = 10.00000000000000000000000 \times 2^{b(\mathrm{exp})}$

where a and b are both represented in binary form according to the normalized
single-precision floating-point format. Then

  $10.00000000000000000000000 \times 2^{a(\mathrm{exp})}$
x $10.00000000000000000000000 \times 2^{b(\mathrm{exp})}$
= $0100.0000000000000000000000000000000000000000000000 \times 2^{a(\mathrm{exp}) + b(\mathrm{exp})}$

To place this number in the proper normalized format, it is necessary to shift
the mantissa two places to the right and add two to the exponent. This yields

  $10.00000000000000000000000 \times 2^{a(\mathrm{exp})}$
x $10.00000000000000000000000 \times 2^{b(\mathrm{exp})}$
= $01.000000000000000000000000000000000000000000000000 \times 2^{a(\mathrm{exp}) + b(\mathrm{exp}) + 2}$

In floating-point multiplication, the exponent of the result may overflow. This
can occur when the exponents are initially added or when the exponent is
modified during normalization.


Example 5-2. Floating-Point Multiply (Both Mantissas = 1.5) 


Let

$a = 1.5 \times 2^{a(\mathrm{exp})} = 01.10000000000000000000000 \times 2^{a(\mathrm{exp})}$
$b = 1.5 \times 2^{b(\mathrm{exp})} = 01.10000000000000000000000 \times 2^{b(\mathrm{exp})}$

where a and b are both represented in binary form according to the
single-precision floating-point format. Then

  $01.10000000000000000000000 \times 2^{a(\mathrm{exp})}$
x $01.10000000000000000000000 \times 2^{b(\mathrm{exp})}$
= $0010.0100000000000000000000000000000000000000000000 \times 2^{a(\mathrm{exp}) + b(\mathrm{exp})}$

To place this number in the proper normalized format, it is necessary to shift
the mantissa one place to the right and add one to the exponent. This yields

  $01.10000000000000000000000 \times 2^{a(\mathrm{exp})}$
x $01.10000000000000000000000 \times 2^{b(\mathrm{exp})}$
= $01.001000000000000000000000000000000000000000000000 \times 2^{a(\mathrm{exp}) + b(\mathrm{exp}) + 1}$


Example 5-3. Floating-Point Multiply (Both Mantissas = 1.0)


Let

$a = 1.0 \times 2^{a(\mathrm{exp})} = 01.00000000000000000000000 \times 2^{a(\mathrm{exp})}$
$b = 1.0 \times 2^{b(\mathrm{exp})} = 01.00000000000000000000000 \times 2^{b(\mathrm{exp})}$

where a and b are both represented in binary form according to the
single-precision floating-point format. Then

  $01.00000000000000000000000 \times 2^{a(\mathrm{exp})}$
x $01.00000000000000000000000 \times 2^{b(\mathrm{exp})}$
= $0001.0000000000000000000000000000000000000000000000 \times 2^{a(\mathrm{exp}) + b(\mathrm{exp})}$


This number is in the proper normalized format. Therefore, no shift of the mantissa or mod- 
ification of the exponent is necessary. 


These examples have shown cases where the product of two normalized numbers can be 
normalized with a shift of zero, one, or two. For all normalized inputs with the floating-point 
format used by the TMS320C30, a normalized result can be produced by a shift of zero, one, 
or two. 


Example 5-4. Floating-Point Multiply Between Positive and Negative Numbers 
Let

$a = 1.0 \times 2^{a(\mathrm{exp})} = 01.00000000000000000000000 \times 2^{a(\mathrm{exp})}$
$b = -2.0 \times 2^{b(\mathrm{exp})} = 10.00000000000000000000000 \times 2^{b(\mathrm{exp})}$

Then

  $01.00000000000000000000000 \times 2^{a(\mathrm{exp})}$
x $10.00000000000000000000000 \times 2^{b(\mathrm{exp})}$
= $1110.0000000000000000000000000000000000000000000000 \times 2^{a(\mathrm{exp}) + b(\mathrm{exp})}$

The result is $c = -2.0 \times 2^{a(\mathrm{exp}) + b(\mathrm{exp})}$.


Example 5-5. Floating-Point Multiply by Zero


All multiplications by a floating-point zero yield a result of zero (f = 0,
s = 0, and exp = -128).
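
The multiplication steps and Examples 5-1 through 5-5 can be traced with a small integer model. This is a sketch, not the device's implementation: the name `fmul` and the constants are hypothetical, mantissas are passed as signed integers with the implied bit made explicit (23 fraction bits in, 31 fraction bits out), and the extra product bits are assumed to be dropped by truncation.

```python
ZERO_EXP = -128   # reserved exponent that represents zero
MAX_EXP = 127
MIN_EXP = -127


def fmul(a_man, a_exp, b_man, b_exp):
    """Multiply two normalized operands; returns (c_man, c_exp).

    Input mantissas carry 23 fraction bits, so 1.0 is 1 << 23 and -2.0 is
    -(1 << 24). The returned mantissa carries 31 fraction bits.
    """
    if a_exp == ZERO_EXP or b_exp == ZERO_EXP:
        return 0, ZERO_EXP                    # multiplication by zero
    product = a_man * b_man                   # step 1: 46 fraction bits
    c_exp = a_exp + b_exp                     # step 2
    for shift in (0, 1, 2):                   # steps 4-6: shift of 0, 1, or 2
        low, high = 1 << (46 + shift), 1 << (47 + shift)
        if low <= product < high or -high <= product < -low:
            break
    else:
        raise ValueError("operands must be normalized")
    c_exp += shift
    c_man = product >> (15 + shift)           # step 10: keep 31 fraction bits
    if c_exp > MAX_EXP:                       # exponent overflow: saturate
        return ((1 << 32) - 1, MAX_EXP) if product > 0 else (-(1 << 32), MAX_EXP)
    if c_exp < MIN_EXP:                       # exponent underflow: zero
        return 0, ZERO_EXP
    return c_man, c_exp


# Example 5-1: (-2.0) * (-2.0) needs a right shift of two.
print(fmul(-(1 << 24), 0, -(1 << 24), 0) == (1 << 31, 2))
```

The loop tries shifts of zero, one, and two only, matching the statement that every product of normalized inputs can be normalized with one of those three shifts.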


Floating-Point Operations - Addition/Subtraction 


5.5 Floating-Point Addition and Subtraction 


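In floating-point addition and subtraction, two floating-point numbers a and
b can be defined as

$a = a(\mathrm{man}) \times 2^{a(\mathrm{exp})}$
$b = b(\mathrm{man}) \times 2^{b(\mathrm{exp})}$

The sum (or difference) of a and b can be defined as

$c = a \pm b$
$\;\; = \left(a(\mathrm{man}) \pm \left(b(\mathrm{man}) \times 2^{-(a(\mathrm{exp}) - b(\mathrm{exp}))}\right)\right) \times 2^{a(\mathrm{exp})}$, if $a(\mathrm{exp}) \ge b(\mathrm{exp})$
$\;\; = \left(\left(a(\mathrm{man}) \times 2^{-(b(\mathrm{exp}) - a(\mathrm{exp}))}\right) \pm b(\mathrm{man})\right) \times 2^{b(\mathrm{exp})}$, if $a(\mathrm{exp}) < b(\mathrm{exp})$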


The flowchart for floating-point addition is shown in Figure 5-10. Since this
flowchart assumes signed data, it is also appropriate for floating-point
subtraction. In this figure, it is assumed that $a(\mathrm{exp}) \le b(\mathrm{exp})$.
In step 1, the source exponents are compared, and c(exp) is set equal to the
larger of the two source exponents. In step 2, d is set to the difference of
the two exponents. In step 3, the mantissa with the smaller exponent, in this
case a(man), is right-shifted d bits in order to align the mantissas. After
the mantissas have been aligned, they are added (step 4).


Steps 5 through 7 check for special cases of c(man). If c(man) is zero (step
5), then c(exp) is set to its most-negative value (step 8) to yield the correct
representation of zero. If c(man) has overflowed (step 6), then in step 9,
c(man) is right-shifted one bit and one is added to c(exp). In step 10, the
result is normalized. In steps 11 and 12, special cases of c(exp) are tested.
If c(exp) has overflowed, then c is set to the most-positive extended-precision
value if it is positive; otherwise, it is set to the most-negative
extended-precision value.
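
The alignment, addition, and normalization steps described above can be traced with the following integer model. It is a sketch under stated assumptions: the name `fadd` is hypothetical, mantissas are signed integers with 31 fraction bits and the implied bit made explicit, zero is passed as mantissa 0 with exponent -128, and bits shifted out during alignment are simply truncated.

```python
ZERO_EXP, MIN_EXP, MAX_EXP = -128, -127, 127
ONE, TWO = 1 << 31, 1 << 32        # mantissa values 1.0 and 2.0


def fadd(a_man, a_exp, b_man, b_exp):
    """Add two extended-precision operands; returns (c_man, c_exp)."""
    if a_exp > b_exp:                          # make a the smaller exponent
        a_man, a_exp, b_man, b_exp = b_man, b_exp, a_man, a_exp
    c_exp = b_exp                              # step 1
    d = b_exp - a_exp                          # step 2
    a_man >>= d                                # step 3: align (truncating)
    c_man = a_man + b_man                      # step 4
    if c_man == 0:                             # steps 5 and 8
        return 0, ZERO_EXP
    if c_man >= TWO or c_man < -TWO:           # steps 6 and 9: overflow
        c_man >>= 1
        c_exp += 1
    else:                                      # steps 7 and 10: normalize
        while not (ONE <= c_man < TWO or -TWO <= c_man < -ONE):
            c_man <<= 1
            c_exp -= 1
    if c_exp > MAX_EXP:                        # exponent overflow: saturate
        return (TWO - 1, MAX_EXP) if c_man > 0 else (-TWO, MAX_EXP)
    if c_exp < MIN_EXP:                        # exponent underflow: zero
        return 0, ZERO_EXP
    return c_man, c_exp


# 1.5 * 2**0 + 1.0 * 2**-1 gives 1.0 * 2**1
print(fadd(3 << 30, 0, 1 << 31, -1) == (1 << 31, 1))
```

Subtraction can be traced with the same function by negating the second mantissa before the call, since the model works on signed values.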


[Flowchart, recovered steps:
 (1)  If a(exp) <= b(exp), c(exp) = b(exp); else c(exp) = a(exp).
      (Assume for simplicity that a(exp) <= b(exp).)
 (2)  d = b(exp) - a(exp).
 (3)  a(man) = a(man) >> d. Discard LSBs to keep a(man) in
      extended-precision floating-point format.
 (4)  Add mantissas: c(man) = a(man) + b(man).
 Test for special cases of c(man):
 (5)  c(man) = 0.
 (6)  Overflow of c(man).
 (7)  k = number of leading non-significant sign bits.
 (9)  c(man) = c(man) >> 1, c(exp) = c(exp) + 1. Discard LSBs to keep
      in extended-precision floating-point format.
 (10) c(man) << k, c(exp) = c(exp) - k.
 Test for special cases of c(exp):
 (11) c(exp) overflow: if c(man) > 0, set c to most positive value;
      if c(man) < 0, set c to most negative value.
 (12) c(exp) underflow: set c to zero, c(exp) = -128, c(man) = 0.
 (13) c(exp) in range.
 (16) Set c to final result: c = a + b.]


Figure 5-10. Flowchart for Floating-Point Addition 




The following examples describe the floating-point addition and subtraction
operations. It is assumed that the data is in the extended-precision
floating-point format.


Example 5-6. Floating-Point Addition 


In the case of two normalized numbers to be summed, let

$a = 1.5 = 01.1000000000000000000000000000000 \times 2^{0}$
$b = 0.5 = 01.0000000000000000000000000000000 \times 2^{-1}$

It is necessary to shift b to the right by one so that a and b have the same
exponent. This yields

$b = 0.5 = 00.1000000000000000000000000000000 \times 2^{0}$

Then

  $01.1000000000000000000000000000000 \times 2^{0}$
+ $00.1000000000000000000000000000000 \times 2^{0}$
= $010.0000000000000000000000000000000 \times 2^{0}$

As in the case of multiplication, it is necessary to shift the binary point one
place to the left and to add one to the exponent. This yields

  $01.1000000000000000000000000000000 \times 2^{0}$
+ $00.1000000000000000000000000000000 \times 2^{0}$
= $01.0000000000000000000000000000000 \times 2^{1}$


Example 5-7. Floating-Point Subtraction


A subtraction is performed in this example. Let

$a = 01.0000000000000000000000000000001 \times 2^{0}$
$b = 01.0000000000000000000000000000000 \times 2^{0}$

The operation to be performed is a - b. The mantissas are already aligned since
the two numbers have the same exponent. The result is a large cancellation
of the upper bits, as shown below.

  $01.0000000000000000000000000000001 \times 2^{0}$
- $01.0000000000000000000000000000000 \times 2^{0}$
= $00.0000000000000000000000000000001 \times 2^{0}$

The result must be normalized. In this case, a left-shift of 31 is required. The
exponent of the result is modified accordingly. The result is

  $01.0000000000000000000000000000001 \times 2^{0}$
- $01.0000000000000000000000000000000 \times 2^{0}$
= $01.0000000000000000000000000000000 \times 2^{-31}$




Example 5-8. Floating-Point Addition with a 32-Bit Shift 


This example illustrates a situation where a full 32-bit shift is necessary to
normalize the result. Let

$a = 01.1111111111111111111111111111111 \times 2^{127}$
$b = 10.0000000000000000000000000000000 \times 2^{127}$

The operation to be performed is a + b.

  $01.1111111111111111111111111111111 \times 2^{127}$
+ $10.0000000000000000000000000000000 \times 2^{127}$
= $11.1111111111111111111111111111111 \times 2^{127}$

Normalizing the result requires a left-shift of 32 and a subtraction of 32 from
the exponent. The result is

  $01.1111111111111111111111111111111 \times 2^{127}$
+ $10.0000000000000000000000000000000 \times 2^{127}$
= $10.0000000000000000000000000000000 \times 2^{95}$


Example 5-9. Floating-Point Addition/Subtraction and Zero 


When floating-point addition or subtraction is performed with a
floating-point 0, the following identities are satisfied:


Floating-Point Operations - Normalization 


5.6 Normalization Using the NORM Instruction 


The NORM instruction takes an extended-precision floating-point number, 
assumed to be unnormalized, and normalizes it. Since the number is assumed 
to be unnormalized, no implied most-significant nonsign bit is assumed. The 
NORM instruction executes the following three steps: 


1) Locate the most-significant nonsign bit of the floating-point number. 
2)  Left-shift to normalize the number. 
3) Adjust the exponent. 


Given the extended-precision floating-point value a to be normalized, the
normalization (norm()) is performed as shown in Figure 5-11.


[Flowchart: test for special cases of a(man). If a(man) = 0, c(exp) = -128.
Otherwise, with k = number of leading non-significant sign bits, a(man) is
sign-extended 1 bit, c(man) = a(man) << k, c(exp) = a(exp) - k, and the
most-significant non-sign bit is removed. Test for special cases of c(exp):
on c(exp) underflow, c(exp) = -128 with no change to c(man); otherwise
c(exp) is in range. Set c to final result: c = norm(a).]


Figure 5-11. Flowchart for NORM Instruction Operation 




Example 5-10. NORM Instruction 


Assume that an extended-precision register contains the value

man = 00000000000000000001000000000001, exp = 0

When the normalization is performed on a number assumed to be unnormalized,
the binary point is assumed to be

man = 0.0000000000000000001000000000001, exp = 0

This number is then sign-extended one bit so that the mantissa contains 33
bits.

man = 00.0000000000000000001000000000001, exp = 0

The intermediate result after the most-significant nonsign bit is located and
the shift performed is

man = 01.0000000000010000000000000000000, exp = -19

The final 32-bit value output after removing the redundant bit is

man = 00000000000010000000000000000000, exp = -19


The NORM instruction is useful for counting the number of leading zeros or 
leading ones in a 32-bit field. If the exponent is initially zero, the absolute 
value of the final value of the exponent is the number of leading ones or zeros. 
This instruction is also useful for manipulating unnormalized floating-point 
numbers. 
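
The three NORM steps and the leading-bit count of Example 5-10 can be traced with the sketch below. The names `norm` and `stored_mantissa` are hypothetical, the mantissa is passed as a signed 32-bit integer with no implied bit, and exponent underflow is deliberately left out of the model.

```python
ONE, TWO = 1 << 31, 1 << 32        # mantissa values 1.0 and 2.0


def norm(man, exp):
    """Normalize a signed 32-bit mantissa (31 fraction bits, no implied bit).

    Returns (c_man, c_exp) where c_man has the most-significant nonsign bit
    made explicit, i.e. it lies in [1.0, 2.0) or [-2.0, -1.0).
    Exponent underflow is not modelled in this sketch.
    """
    if man == 0:
        return 0, -128                         # representation of zero
    k = 0
    while not (ONE <= man < TWO or -TWO <= man < -ONE):
        man <<= 1                              # left-shift to normalize
        k += 1
    return man, exp - k                        # adjust the exponent


def stored_mantissa(c_man):
    """Drop the redundant bit: sign bit followed by the 31 fraction bits."""
    return (0x80000000 if c_man < 0 else 0) | (c_man & 0x7FFFFFFF)


c_man, c_exp = norm(0b00000000000000000001000000000001, 0)
print(c_exp, format(stored_mantissa(c_man), "032b"))
```

With an initial exponent of zero, the negated result exponent is the count of leading zeros or ones, as described above.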


Floating-Point Operations - Rounding 


5.7 Rounding: The RND Instruction


The RND instruction rounds a number from the extended-precision
floating-point format to the single-precision floating-point format. Rounding
is similar to floating-point addition. Given the number a to be rounded, the
following operation is performed first.

$c = a(\mathrm{man}) \times 2^{a(\mathrm{exp})} + (1 \times 2^{a(\mathrm{exp}) - 24})$

Next, a conversion from extended-precision floating-point to single-precision
floating-point format is performed. Given the extended-precision
floating-point value, the rounding (rnd()) is performed as shown in Figure 5-12.
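
The add-then-truncate behaviour of rounding can be sketched with integers. This is an illustration under assumptions, not the device's logic: the name `rnd` is hypothetical, the mantissa is a signed integer with 31 fraction bits and the implied bit made explicit, and the renormalization of a negative mantissa that rounds to exactly -1.0 is an addition of this sketch. Exponent underflow is not modelled.

```python
ZERO_EXP, MAX_EXP = -128, 127
ONE, TWO = 1 << 31, 1 << 32        # mantissa values 1.0 and 2.0


def rnd(man, exp):
    """Round an extended-precision value to single precision."""
    if exp == ZERO_EXP:
        return 0, ZERO_EXP
    c_man = man + (1 << 7)          # add 2**-24 in mantissa units
    c_exp = exp
    if c_man >= TWO:                # overflow of c(man)
        c_man >>= 1
        c_exp += 1
    c_man &= ~0xFF                  # set the 8 LSBs of c(man) to zero
    if c_man == -ONE:               # -1.0 is not a normalized mantissa:
        c_man, c_exp = -TWO, c_exp - 1   # rewrite it as -2.0 * 2**(exp - 1)
    if c_exp > MAX_EXP:             # saturate to single-precision limits
        return (TWO - (1 << 8), MAX_EXP) if c_man > 0 else (-TWO, MAX_EXP)
    return c_man, c_exp


print(rnd((1 << 31) | 0x80, 0))     # rounds up by one single-precision LSB
print(rnd((1 << 31) | 0x7F, 0))     # rounds down to 1.0
```

Adding half of a single-precision LSB before truncating is what turns plain truncation into round-to-nearest.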


[Flowchart: add 1 x 2^(a(exp) - 24) to a. Test for special cases of c(man):
c(man) = 0; overflow of c(man), in which case c(man) = c(man) >> 1 and
c(exp) = a(exp) + 1; or no special case. If c(man) > 0, set c to the
most-positive single-precision value; if c(man) < 0, set c to the
most-negative single-precision value. Set the 8 LSBs of c(man) to zero.
Final result: c = rnd(a).]


Figure 5-12. Flowchart for Floating-Point Rounding by the RND Instruction


5.8 Floating-Point to Integer Conversion


Floating-point to integer conversion, using the FIX instructions, allows
extended-precision floating-point numbers to be converted to single-precision
integers in a single cycle. The floating-point to integer conversion of the value
x will be referred to here as fix(x). The conversion will not overflow if a, the
number to be converted, is in the range:

$-2^{31} \le a \le 2^{31} - 1$


First, it is necessary to be certain that

$a(\mathrm{exp}) \le 30$

If this bound is not met, an overflow occurs. If an overflow occurs in the
positive direction, the output is the most positive integer. If an overflow
occurs in the negative direction, the output is the most negative integer. If
a(exp) is within the valid range, then a(man), with implied bit included, is
sign-extended and right-shifted (rs) by the amount

$rs = 31 - a(\mathrm{exp})$

This right-shift (rs) shifts out those bits corresponding to the fractional part
of the mantissa. For example:

If $0 \le x < 1$, then fix(x) = 0.
If $-1 \le x < 0$, then fix(x) = -1.


The flowchart for the floating-point to integer conversion is shown in Figure 
5-13. 
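
The overflow test and the right shift can be sketched as follows. The name `fix` mirrors the notation above but the function itself is a hypothetical model: the mantissa is a signed integer with 31 fraction bits and the implied bit made explicit, and zero is passed with exponent -128.

```python
INT_MAX = (1 << 31) - 1
INT_MIN = -(1 << 31)


def fix(man, exp):
    """Convert an extended-precision value to a 32-bit integer."""
    if exp == -128:                 # floating-point zero
        return 0
    if exp > 30:                    # overflow: saturate by sign
        return INT_MAX if man > 0 else INT_MIN
    rs = 31 - exp                   # shift out the fractional bits
    return man >> rs                # arithmetic shift rounds toward -infinity


print(fix(1 << 31, -1))             # 0.5  -> 0
print(fix(-(1 << 32), -2))          # -0.5 -> -1
print(fix(5 << 29, 2))              # 5.0  -> 5
```

Because the shift is arithmetic, negative fractions convert to -1 rather than 0, matching the second example above.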


Floating-Point Operations - Conversion to Integer 


[Flowchart: test for special cases of a(exp). If a(exp) > 30, c is set to the
most positive integer when a(man) > 0 or to the most negative integer when
a(man) < 0. If a(exp) is in range, a(man) is right-shifted by
rs = 31 - a(exp). Set c to final result: c = fix(a).]


Figure 5-13. Flowchart for Floating-Point to Integer Conversion by FIX Instructions


Floating-Point Operations - Conversion to Floating-Point 


5.9 Integer to Floating-Point Conversion Using the FLOAT Instruction


Integer to floating-point conversion, using the FLOAT instruction, allows
single-precision integers to be converted to extended-precision floating-point
numbers. The flowchart for this conversion is shown in Figure 5-14.
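
The conversion can be sketched as a count of redundant sign bits followed by a left shift. The name `float_from_int` is hypothetical, and the result is returned as a signed mantissa with 31 fraction bits and the implied bit made explicit, which is a modelling choice of this sketch rather than the stored register format.

```python
def float_from_int(n):
    """Convert a 32-bit signed integer to (c_man, c_exp)."""
    if not -(1 << 31) <= n < (1 << 31):
        raise ValueError("n must fit in 32 bits")
    if n == 0:
        return 0, -128                        # representation of zero
    k = 0                                     # leading non-significant sign bits
    m = n
    while not ((1 << 30) <= m < (1 << 31) or -(1 << 31) <= m < -(1 << 30)):
        m <<= 1
        k += 1
    # m now has its sign in bit 31 and the first significant bit in bit 30;
    # one more shift expresses it with 31 fraction bits.
    return m << 1, 30 - k


print(float_from_int(5))     # 1.25 * 2**2
print(float_from_int(-1))    # -2.0 * 2**-1
```

The exponent 30 - k reflects that a 32-bit integer with no redundant sign bits has its most-significant nonsign bit at weight 2^30.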


[Flowchart: test for special cases of c(man): c(man) = 0, or leading
non-significant sign bits. With k = number of leading non-significant sign
bits, c(man) is left-shifted by k (c(man) << k) and c(exp) = 30 - k. The most
significant non-sign bit is then removed. Set c to final result:
c = float(a).]


Figure 5-14. Flowchart for Integer to Floating-Point Conversion Using the FLOAT Instruction


Addressing


The TMS320C30 supports five groups of powerful addressing modes. Six 
types of addressing may be used within the groups, which allow access of 
data from memory, registers, and the instruction word. This section details the 
operation, encoding, and implementation of the addressing modes. Also dis- 
cussed is the management of system stacks, queues, and deques in memory. 
The major topics in this section are: 


• Types of Addressing (Section 6.1)
  - Register
  - Direct
  - Indirect
  - Short-immediate
  - Long-immediate
  - PC-relative


• Groups of Addressing Modes (Section 6.2)
  - General addressing modes
  - Three-operand addressing modes
  - Parallel addressing modes
  - Long-immediate addressing mode
  - Conditional-branch addressing modes


• Circular Addressing (Section 6.3)
• Bit-Reversed Addressing (Section 6.4)
• System Stack Management (Section 6.5)


6.1 Types of Addressing


Six types of addressing allow access of data from memory, registers, and the 
instruction word. They are: 


• Register
• Direct
• Indirect
• Short-immediate
• Long-immediate
• PC-relative


Some types of addressing are appropriate for some instructions and not others. 
For this reason, the types of addressing are used in the five different groups 
of addressing modes as follows: 


• General addressing modes (G):
  - Register
  - Direct
  - Indirect
  - Short-immediate


• Three-operand addressing modes (T):
  - Register
  - Indirect


• Parallel addressing modes (P):
  - Register
  - Indirect


• Long-immediate addressing mode
  - Long-immediate


• Conditional-branch addressing modes (B):
  - Register
  - PC-relative


The six types of addressing will be discussed first, followed by the five groups 
of addressing modes. 


6.1.1 Register Addressing


In register addressing, the operand is contained in a CPU register, as shown
in the example below.


ABSF R1 ; R1 = |R1|


The CPU register addresses, the assembler syntax, and the assigned function
of each register are listed in Table 6-1.


Addressing - Types of Addressing 


Table 6-1. CPU Register/Assembler Syntax and Function


CPU REGISTER   ASSEMBLER   ASSIGNED
ADDRESS        SYNTAX      FUNCTION

00h            R0          Extended-precision register 0
01h            R1          Extended-precision register 1
02h            R2          Extended-precision register 2
03h            R3          Extended-precision register 3
04h            R4          Extended-precision register 4
05h            R5          Extended-precision register 5
06h            R6          Extended-precision register 6
07h            R7          Extended-precision register 7

08h            AR0         Auxiliary register 0
09h            AR1         Auxiliary register 1
0Ah            AR2         Auxiliary register 2
0Bh            AR3         Auxiliary register 3
0Ch            AR4         Auxiliary register 4
0Dh            AR5         Auxiliary register 5
0Eh            AR6         Auxiliary register 6
0Fh            AR7         Auxiliary register 7

10h            DP          Data page pointer
11h            IR0         Index register 0
12h            IR1         Index register 1
13h            BK          Block size
14h            SP          Active stack pointer

15h            ST          Status register
16h            IE          CPU/DMA interrupt enable
17h            IF          CPU interrupt flags
18h            IOF         I/O flags

19h            RS          Repeat start address
1Ah            RE          Repeat end address
1Bh            RC          Repeat counter


6.1.2 Direct Addressing 


In direct addressing, the data address is formed by the concatenation of the 
eight least-significant bits of the data page pointer (DP) with the 16 least- 
significant bits of the instruction word (expr). This results in 256 pages (64 
K words per page), giving the programmer a large address space without 
needing to change the page pointer. The syntax and operation for direct ad- 
dressing are listed below. 


Syntax: @expr 
Operation: address = DP concatenated with expr 


Figure 6-1 shows the formation of the data address. Example 6-1 gives an
instruction example with data before and after instruction execution.

[Figure: bits 15-0 of the instruction word (expr) and bits 7-0 of the data page
pointer (DP) are concatenated to form the 24-bit address (bits 23-0) of the
32-bit operand.]


Figure 6-1. Direct Addressing 


Example 6-1. Direct Addressing


ADDI @0BCDEh,R7


Before Instruction:                After Instruction:

DP = 8Ah                           DP = 8Ah

R7 = 0h                            R7 = 12345678h

Data at 8ABCDEh = 12345678h        Data at 8ABCDEh = 12345678h
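
The address formation in Example 6-1 can be checked with a few lines of Python. The function name `direct_address` and the dictionary used as memory are hypothetical, and the integer add is modelled as a plain 32-bit wrap-around sum purely for illustration.

```python
def direct_address(dp, expr):
    """Concatenate the 8 LSBs of DP with the 16-bit expr field."""
    return ((dp & 0xFF) << 16) | (expr & 0xFFFF)


# Trace of ADDI @0BCDEh,R7 with the register and memory values of Example 6-1
memory = {0x8ABCDE: 0x12345678}
dp, r7 = 0x8A, 0x0
address = direct_address(dp, 0xBCDE)
r7 = (r7 + memory[address]) & 0xFFFFFFFF
print(hex(address), hex(r7))
```

Only the low eight bits of DP take part, so each value of DP selects one of 256 pages of 64K words.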


6.1.3 Indirect Addressing


Indirect addressing is used to specify the address of an operand in memory
through the contents of an auxiliary register, optional displacements, and
index registers. Only the 24 least-significant bits of the auxiliary registers
and index registers are used in indirect addressing. The address arithmetic is
performed by the auxiliary register arithmetic units (ARAUs) on these lower
24 bits and is unsigned. The upper eight bits are unmodified.


The flexibility of indirect addressing is possible because the ARAUs on the
TMS320C30 are used to modify auxiliary registers in parallel with operations
within the main CPU. Indirect addressing is specified by a five-bit field in the
instruction word, referred to as the mod field. A displacement is either an
explicit unsigned 8-bit integer contained in the instruction word or an implicit
displacement of one. Two index registers, IR0 and IR1, can also be used in
indirect addressing. In some cases, circular or bit-reversed addressing is
optional. The mechanism for generating addresses in circular addressing is
discussed in Section 6.3, and bit-reversed addressing in Section 6.4.


Table 6-2 lists the various kinds of indirect addressing, along with the value
of the modification (mod) field, assembler syntax, operation, and function for
each. The succeeding 18 examples show the operation for each kind of
indirect addressing.


Table 6-2. Indirect Addressing


MOD FIELD  SYNTAX          OPERATION                DESCRIPTION

INDIRECT ADDRESSING WITH DISPLACEMENT

00000      *+ARn(disp)     addr = ARn + disp        With pre-displacement add
00001      *-ARn(disp)     addr = ARn - disp        With pre-displacement subtract
00010      *++ARn(disp)    addr = ARn + disp        With pre-displacement add and modify
                           ARn = ARn + disp
00011      *--ARn(disp)    addr = ARn - disp        With pre-displacement subtract and modify
                           ARn = ARn - disp
00100      *ARn++(disp)    addr = ARn               With post-displacement add and modify
                           ARn = ARn + disp
00101      *ARn--(disp)    addr = ARn               With post-displacement subtract and modify
                           ARn = ARn - disp
00110      *ARn++(disp)%   addr = ARn               With post-displacement add and circular modify
                           ARn = circ(ARn + disp)
00111      *ARn--(disp)%   addr = ARn               With post-displacement subtract and circular modify
                           ARn = circ(ARn - disp)

INDIRECT ADDRESSING WITH INDEX REGISTER IR0

01000      *+ARn(IR0)      addr = ARn + IR0         With pre-index (IR0) add
01001      *-ARn(IR0)      addr = ARn - IR0         With pre-index (IR0) subtract
01010      *++ARn(IR0)     addr = ARn + IR0         With pre-index (IR0) add and modify
                           ARn = ARn + IR0
01011      *--ARn(IR0)     addr = ARn - IR0         With pre-index (IR0) subtract and modify
                           ARn = ARn - IR0
01100      *ARn++(IR0)     addr = ARn               With post-index (IR0) add and modify
                           ARn = ARn + IR0
01101      *ARn--(IR0)     addr = ARn               With post-index (IR0) subtract and modify
                           ARn = ARn - IR0
01110      *ARn++(IR0)%    addr = ARn               With post-index (IR0) add and circular modify
                           ARn = circ(ARn + IR0)
01111      *ARn--(IR0)%    addr = ARn               With post-index (IR0) subtract and circular modify
                           ARn = circ(ARn - IR0)

INDIRECT ADDRESSING WITH INDEX REGISTER IR1

10000      *+ARn(IR1)      addr = ARn + IR1         With pre-index (IR1) add
10001      *-ARn(IR1)      addr = ARn - IR1         With pre-index (IR1) subtract
10010      *++ARn(IR1)     addr = ARn + IR1         With pre-index (IR1) add and modify
                           ARn = ARn + IR1
10011      *--ARn(IR1)     addr = ARn - IR1         With pre-index (IR1) subtract and modify
                           ARn = ARn - IR1
10100      *ARn++(IR1)     addr = ARn               With post-index (IR1) add and modify
                           ARn = ARn + IR1
10101      *ARn--(IR1)     addr = ARn               With post-index (IR1) subtract and modify
                           ARn = ARn - IR1
10110      *ARn++(IR1)%    addr = ARn               With post-index (IR1) add and circular modify
                           ARn = circ(ARn + IR1)
10111      *ARn--(IR1)%    addr = ARn               With post-index (IR1) subtract and circular modify
                           ARn = circ(ARn - IR1)

INDIRECT ADDRESSING (SPECIAL CASES)

11000      *ARn            addr = ARn               Indirect
11001      *ARn++(IR0)B    addr = ARn               With post-index (IR0) add and bit-reversed modify
                           ARn = B(ARn + IR0)


LEGEND:
addr   = memory address
ARn    = auxiliary register AR0 - AR7
IRn    = index register IR0 or IR1
disp   = displacement
++     = add and modify
--     = subtract and modify
circ() = address in circular addressing
%      = where circular addressing is performed
B      = where bit-reversed addressing is performed


Example 6-2. Auxiliary Register Indirect


The address of the operand to be fetched is contained in an auxiliary register (ARn). 


Operation: operand address = ARn 
Assembler Syntax: *ARn 
Modification Field: 11000 


[Figure: the 24 least-significant bits of ARn (bits 23-0) form the address of
the 32-bit operand.]


Example 6-3. Indirect with Pre-Displacement Add


The address of the operand to be fetched is the sum of an auxiliary register (ARn) and the 
displacement (disp). The displacement is either an eight-bit unsigned integer contained in 
the instruction word or an implied value of 1. 


Operation: operand address = ARn+disp 
Assembler Syntax: *+ARn(disp) 
Modification Field: 00000 

[Figure: ARn (bits 23-0) and the zero-extended disp (bits 7-0) are added to
form the operand address.]


Example 6-4. Indirect with Pre-Displacement Subtract 


The address of the operand to be fetched is the contents of an auxiliary register (ARn) minus 
the displacement (disp). The displacement is either an eight-bit unsigned integer contained 
in the instruction word or an implied value of 1. 


Operation: operand address = ARn-disp
Assembler Syntax: *-ARn(disp)
Modification Field: 00001

[Figure: the zero-extended disp (bits 7-0) is subtracted from ARn (bits 23-0)
to form the operand address.]


Example 6-5. Indirect with Pre-Displacement Add and Modify


The address of the operand to be fetched is the sum of an auxiliary register (ARn) and the 
displacement (disp). The displacement is either an eight-bit unsigned integer contained in 
the instruction word or an implied value of 1. After the data is fetched, the auxiliary register 
is updated with the address generated. 


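Operation: operand address = ARn+disp
           ARn=ARn+disp
Assembler Syntax: *++ARn(disp)
Modification Field: 00010

[Figure: ARn (bits 23-0) and the zero-extended disp (bits 7-0) are added to
form the operand address, which is also written back to ARn.]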


Example 6-6. Indirect with Pre-Displacement Subtract and Modify 


The address of the operand to be fetched is the contents of an auxiliary register (ARn) minus 
the displacement (disp). The displacement is either an eight-bit unsigned integer contained 
in the instruction word or an implied value of 1. After the data is fetched, the auxiliary reg- 
ister is updated with the address generated. 


Operation: operand address = ARn-disp
           ARn=ARn-disp
Assembler Syntax: *--ARn(disp)
Modification Field: 00011

[Figure: the zero-extended disp (bits 7-0) is subtracted from ARn (bits 23-0)
to form the operand address, which is also written back to ARn.]


Example 6-7. Indirect with Post-Displacement Add and Modify


The address of the operand to be fetched is the contents of an auxiliary register (ARn). After 
the operand is fetched, the displacement (disp) is added to the auxiliary register. The dis- 
placement is either an eight-bit unsigned integer contained in the instruction word or an 
implied value of 1. 


Operation: operand address = ARn
           ARn=ARn+disp
Assembler Syntax: *ARn++(disp)
Modification Field: 00100

[Figure: ARn (bits 23-0) supplies the operand address; the zero-extended disp
(bits 7-0) is then added to ARn.]
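
The non-circular displacement modes of Table 6-2 differ only in whether the displacement is applied before the access and whether ARn is updated. The sketch below models them in Python. The function name `indirect` is hypothetical, and wrap-around of the 24-bit arithmetic modulo 2^24 is an assumption of this sketch.

```python
MASK24 = 0xFFFFFF


def indirect(ar, disp, mode):
    """Return (operand_address, new_ARn) for one displacement mode.

    ar is the 32-bit auxiliary register value; only its low 24 bits take
    part in the arithmetic and the upper 8 bits are left unmodified.
    """
    upper = ar & 0xFF000000
    base = ar & MASK24
    disp &= 0xFF                               # unsigned 8-bit displacement
    plus = (base + disp) & MASK24              # assumed to wrap at 24 bits
    minus = (base - disp) & MASK24
    modes = {
        "*+ARn(disp)":  (plus, base),          # pre-add, ARn unchanged
        "*-ARn(disp)":  (minus, base),         # pre-subtract, ARn unchanged
        "*++ARn(disp)": (plus, plus),          # pre-add and modify
        "*--ARn(disp)": (minus, minus),        # pre-subtract and modify
        "*ARn++(disp)": (base, plus),          # post-add and modify
        "*ARn--(disp)": (base, minus),         # post-subtract and modify
    }
    address, low = modes[mode]
    return address, upper | low


print([hex(v) for v in indirect(0x000100, 4, "*++ARn(disp)")])
print([hex(v) for v in indirect(0x000100, 4, "*ARn++(disp)")])
```

The index-register modes follow the same pattern with the low 24 bits of IR0 or IR1 in place of the displacement.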


Example 6-8. Indirect with Post-Displacement Subtract and Modify 


The address of the operand to be fetched is the contents of an auxiliary register (ARn). After 
the operand is fetched, the displacement (disp) is subtracted from the auxiliary register. The 
displacement is either an eight-bit unsigned integer contained in the instruction word or an 
implied value of 1. 


Operation: operand address = ARn
           ARn=ARn-disp
Assembler Syntax: *ARn--(disp)
Modification Field: 00101

[Figure: ARn (bits 23-0) supplies the operand address; the zero-extended disp
(bits 7-0) is then subtracted from ARn.]


Example 6-9. Indirect with Post-Displacement Add and Circular Modify


The address of the operand to be fetched is the contents of an auxiliary register (ARn). After
the operand is fetched, the displacement (disp) is added to the contents of the auxiliary
register using circular addressing. This result is used to update the auxiliary register. The
displacement is either an eight-bit unsigned integer contained in the instruction word or an
implied value of 1.


Operation: operand address = ARn
           ARn=circ(ARn+disp)
Assembler Syntax: *ARn++(disp)%
Modification Field: 00110

[Figure: ARn (bits 23-0) supplies the operand address; the zero-extended disp
(bits 7-0) is then added to ARn using circular addressing.]


Example 6-10. Indirect with Post-Displacement Subtract and Circular Modify 


The address of the operand to be fetched is the contents of an auxiliary register (ARn). After
the operand is fetched, the displacement (disp) is subtracted from the contents of the
auxiliary register using circular addressing. This result is used to update the auxiliary register.
The displacement is either an eight-bit unsigned integer contained in the instruction word
or an implied value of 1.


Operation: operand address = ARn
ARn=circ(ARn-disp)
Assembler Syntax: *ARn--(disp)%
Modification Field: 00111


Example 6-11. Indirect with Pre-Index Add


The address of the operand to be fetched is the sum of an auxiliary register (ARn) and an
index register (IR0 or IR1).


Operation: operand address = ARn+IRm
Assembler Syntax: *+ARn(IRm)
Modification Field: 01000 if m=0
10000 if m=1


Example 6-12. Indirect with Pre-Index Subtract 


The address of the operand to be fetched is the difference of an auxiliary register (ARn) and
an index register (IR0 or IR1).


Operation: operand address = ARn-IRm
Assembler Syntax: *-ARn(IRm)
Modification Field: 01001 if m=0
10001 if m=1


Example 6-13. Indirect with Pre-Index Add and Modify


The address of the operand to be fetched is the sum of an auxiliary register (ARn) and an
index register (IR0 or IR1). After the data is fetched, the auxiliary register is updated with
the address generated.


Operation: operand address = ARn+IRm
ARn=ARn+IRm


Assembler Syntax: *++ARn(IRm)
Modification Field: 01010 if m=0
10010 if m=1


Example 6-14. Indirect with Pre-Index Subtract and Modify 


The address of the operand to be fetched is the difference of an auxiliary register (ARn) and
an index register (IR0 or IR1). The resulting address becomes the new contents of the
auxiliary register.


Operation: operand address = ARn-IRm
ARn=ARn-IRm
Assembler Syntax: *--ARn(IRm)
Modification Field: 01011 if m=0
10011 if m=1


Example 6-15. Indirect with Post-Index Add and Modify


The address of the operand to be fetched is the contents of an auxiliary register (ARn). After
the operand is fetched, the index register (IR0 or IR1) is added to the auxiliary register.


Operation: operand address = ARn
ARn=ARn+IRm
Assembler Syntax: *ARn++(IRm)
Modification Field: 01100 if m=0
10100 if m=1


Example 6-16. Indirect with Post-Index Subtract and Modify 


The address of the operand to be fetched is the contents of an auxiliary register (ARn). After
the operand is fetched, the index register (IR0 or IR1) is subtracted from the auxiliary
register.


Operation: operand address = ARn 
ARn=ARn-IRm 
Assembler Syntax: *ARn--(IRm) 
Modification Field: 01101 if m=0 
10101 if m=1 
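

The indirect modes in Examples 6-7 through 6-16 differ only in when the modification is
applied and whether ARn is updated. The following Python sketch models that bookkeeping
under simple assumptions: registers are plain integers with no width limit, and the function
names and register values are hypothetical, chosen only for illustration.

```python
def post_modify(ar, step):
    """Post-displacement or post-index modes (*ARn++(x), *ARn--(x)).

    The operand address is the old ARn; ARn is updated afterwards.
    Pass a negative step for the subtract forms.
    """
    operand_address = ar
    return operand_address, ar + step


def pre_modify(ar, step):
    """Pre-index add/subtract and modify (*++ARn(IRm), *--ARn(IRm)).

    The modified value is both the operand address and the new ARn.
    """
    operand_address = ar + step
    return operand_address, operand_address


def pre_no_modify(ar, step):
    """Pre-index add/subtract without modify (*+ARn(IRm), *-ARn(IRm)).

    The operand address is offset, but ARn keeps its old value.
    """
    return ar + step, ar


# Hypothetical register contents (not taken from the manual).
AR0, IR0 = 0x100, 4
print(post_modify(AR0, IR0))    # models *AR0++(IR0)
print(pre_modify(AR0, -IR0))    # models *--AR0(IR0)
print(pre_no_modify(AR0, IR0))  # models *+AR0(IR0)
```

Each function returns the pair (operand address, new ARn), which mirrors the two
"Operation" lines given for each example.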


Example 6-17. Indirect with Post-Index Add and Circular Modify


The address of the operand to be fetched is the contents of an auxiliary register (ARn). After
the operand is fetched, the index register (IR0 or IR1) is added to the auxiliary register. This
value is evaluated using circular addressing and replaces the contents of the auxiliary register.


Operation: operand address = ARn
ARn=circ(ARn+IRm)
Assembler Syntax: *ARn++(IRm)%
Modification Field: 01110 if m=0
10110 if m=1


Example 6-18. Indirect with Post-Index Subtract and Circular Modify 


The address of the operand to be fetched is the contents of an auxiliary register (ARn). After
the operand is fetched, the index register (IR0 or IR1) is subtracted from the auxiliary
register. This value is evaluated using circular addressing and replaces the contents of the
auxiliary register.


Operation: operand address = ARn
ARn=circ(ARn-IRm)
Assembler Syntax: *ARn--(IRm)%
Modification Field: 01111 if m=0
10111 if m=1


Example 6-19. Indirect with Post-Index Add and Bit-Reversed Modify


The address of the operand to be fetched is the contents of an auxiliary register (ARn). After
the operand is fetched, the index register (IR0) is added to the auxiliary register. This
addition is performed with a reverse-carry propagation and can be used to yield a bit-reversed
(B) address. This value replaces the contents of the auxiliary register.


Operation: operand address = ARn
ARn=B(ARn+IR0)
Assembler Syntax: *ARn++(IR0)B
Modification Field: 11001


6.1.4 Short-Immediate Addressing 


In short-immediate addressing, the operand is a 16-bit immediate value con- 
tained in the 16 least-significant bits of the instruction word (expr). De- 
pending upon the data types assumed for the instruction, the short-immediate 
operand may be a two’s-complement integer, an unsigned integer, or a float- 
ing-point number. The syntax for this mode is listed below. 


Syntax: expr 


Example 6-20 gives an instruction example with before and after instruction 
data. 
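

How the 16-bit field is widened depends on the data type the instruction assumes. The
Python sketch below covers only the two's-complement integer case and assumes, for
illustration, that SUBI treats its short-immediate operand that way; the helper names are
hypothetical. It reproduces the SUBI 1,R0 result given in Example 6-20.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a two's-complement integer."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def subi(imm16, reg):
    """Model SUBI imm16,reg: reg minus the sign-extended immediate, mod 2**32."""
    return (reg - sign_extend16(imm16)) & MASK32


# SUBI 1,R0 with R0 = 0h before the instruction.
print(f"{subi(1, 0):08X}h")
```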


Example 6-20. Short-Immediate Addressing


SUBI 1,R0
Before Instruction:        After Instruction:
R0 = 0h                    R0 = 0FFFFFFFFh
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6.1.5 Long-Immediate Addressing 


In long-immediate addressing, the operand is a 24-bit immediate value con- 
tained in the 24 least-significant bits of the instruction word (expr). The syn- 
tax for this mode is listed below. 


Syntax: expr 


Example 6-21 gives an instruction example with before and after instruction 
data. 


Example 6-21. Long-Immediate Addressing 
BR 8000h 


Before Instruction: After Instruction: 
PC = 0h PC = 8000h


6.1.6 PC-Relative Addressing 


PC-relative addressing is used for branching. It replaces the value of the PC
based upon the contents of the 16 least-significant bits of the instruction word.

The assembler takes the src (a label or address) specified by the user and
generates a displacement. If the branch is a standard branch, this displacement
is equal to the label - (PC+1). If the branch is a delayed branch, this
displacement is equal to the label - (PC+3).


The displacement is stored as a 16-bit signed integer in the least significant 
bits of the instruction word. 
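

The displacement rule can be checked with a few lines of Python. This is a sketch of the
arithmetic only, not the assembler's implementation; the function names are hypothetical,
and the range check simply reflects the 16-bit signed field described above.

```python
def branch_displacement(label, pc, delayed=False):
    """Return label - (PC+1) for a standard branch, label - (PC+3) if delayed."""
    offset = 3 if delayed else 1
    disp = label - (pc + offset)
    if not -0x8000 <= disp <= 0x7FFF:
        raise ValueError("displacement does not fit in a 16-bit signed field")
    return disp


def encode_disp16(disp):
    """Two's-complement bit pattern stored in the 16 least-significant bits."""
    return disp & 0xFFFF


# Values from Example 6-22: PC = 1, NEWPC = 5.
print(branch_displacement(5, 1))
print(branch_displacement(5, 1, delayed=True))
```

With PC = 1 and NEWPC = 5, the standard branch yields the displacement of 3 quoted in
Example 6-22.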


Syntax: expr 


Example 6-22 gives an instruction example with before and after instruction 
data. 


Example 6-22. PC-Relative Addressing 


BU NEWPC ; pc=1, NEWPC=5, displacement=3
Before Instruction:        After Instruction:
PC = 1h                    PC = 5h


6.2 Groups of Addressing Modes


The types of addressing are used to form the following five groups of ad- 
dressing modes: 


•  General addressing modes (G)
•  Three-operand addressing modes (T)
•  Parallel addressing modes (P)
•  Long-immediate addressing mode
•  Conditional-branch addressing modes (B)


These groups of addressing modes are discussed in the following sections. 


6.2.1 General Addressing Modes 


Instructions that use the general addressing modes are general-purpose in- 
structions, such as ADDI, MPYF, and LSH. Such instructions are usually of 
the form: 


dst operation src -> dst


where the destination operand is signified by dst, the source operand by src,
and ‘operation’ defines an operation to be performed using the general
addressing modes to specify certain operands. Bits 31-29 are zero, indicating
general addressing mode instructions. Bits 22 and 21 specify the general
addressing mode (G) field, which defines how bits 15 through 0 are to be
interpreted for addressing the src operand.


Options for bits 22 and 21 (G field) are as follows: 
00 register (all CPU registers unless specified otherwise) 
01 direct
10 indirect 
11 immediate 
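

As a hedged illustration, the Python sketch below splits an instruction word into the fields
named in this section. The G-field meanings come from the list above; the dst position
(bits 20-16) is read from the bit ruler of Figure 6-2. The function name and the sample word
are hypothetical and do not correspond to any particular instruction.

```python
G_MODES = {0b00: "register", 0b01: "direct", 0b10: "indirect", 0b11: "immediate"}


def decode_general(word):
    """Split a general-addressing-mode instruction word into its fields."""
    if (word >> 29) & 0x7 != 0b000:
        raise ValueError("bits 31-29 are not 000: not a general addressing word")
    return {
        "g": G_MODES[(word >> 21) & 0x3],  # bits 22-21
        "dst": (word >> 16) & 0x1F,        # bits 20-16
        "src_field": word & 0xFFFF,        # bits 15-0, interpreted according to G
    }


# Hypothetical word: G = 11 (immediate), dst = 1, src field = 0001h.
sample = (0b11 << 21) | (1 << 16) | 0x0001
print(decode_general(sample))
```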


If the src and dst fields contain register specifications, the value in these fields 
contains the CPU register addresses as defined by Table 6-1. For the general 
addressing modes, the following values of ARn are valid: 


ARn, 0 ≤ n ≤ 7


Figure 6-2 shows the encoding for the general addressing modes. The nota- 
tion ‘modn’ indicates the modification field that goes with the ARn field. Refer 
to Table 6-2 for further information. 


 31 29 28       23 22 21 20   16 15      11 10   8 7          0
| 0 0 0 | operation | 0 0 |  dst  | src (register)              |
| 0 0 0 | operation | 0 1 |  dst  | direct                      |
| 0 0 0 | operation | 1 0 |  dst  |   modn   |  ARn  |   disp   |
| 0 0 0 | operation | 1 1 |  dst  | immediate                   |
                       G    Destination     Source Operands


Figure 6-2. Encoding for General Addressing Modes 
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6.2.2 Three-Operand Addressing Modes 


Instructions that use the three-operand addressing modes, such as ADDI3,
LSH3, CMPF3, or XOR3, are usually of the form:


SRC1 operation SRC2 -> dst


where the destination operand is signified by dst, the source operands by 
SRC1 and SRC2, and ‘operation’ defines an operation to be performed. Note 
that the ‘3’ can be omitted from three-operand instructions. 


Bits 31-29 are set to the value of 001, indicating three-operand addressing 
mode instructions. Bits 22 and 21 specify the three-operand addressing mode 
(T) field, which defines how bits 15-0 are to be interpreted for addressing the 
SRC operands. Bits 15-8 are used to define the SRC1 address, and bits 7-0 
to define the SRC2 address. Options for bits 22 and 21 (T) are as follows: 


T SRC1 SRC2 

0 0 Register Register
0 1 Indirect Register
1 0 Register Indirect
1 1 Indirect Indirect 
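

The T-field table translates directly into a lookup. The Python sketch below decodes only
the fields stated in the preceding paragraph (bits 31-29, the T field, and the two 8-bit
source fields); the function name and the sample word are hypothetical.

```python
T_MODES = {
    0b00: ("register", "register"),
    0b01: ("indirect", "register"),
    0b10: ("register", "indirect"),
    0b11: ("indirect", "indirect"),
}


def decode_three_operand(word):
    """Return (mode, raw 8-bit field) for SRC1 and SRC2 of a three-operand word."""
    if (word >> 29) & 0x7 != 0b001:
        raise ValueError("bits 31-29 are not 001: not a three-operand word")
    src1_mode, src2_mode = T_MODES[(word >> 21) & 0x3]  # bits 22-21
    return {
        "src1": (src1_mode, (word >> 8) & 0xFF),  # bits 15-8
        "src2": (src2_mode, word & 0xFF),         # bits 7-0
    }


# Hypothetical word: T = 01, SRC1 field = 12h, SRC2 field = 03h.
sample = (0b001 << 29) | (0b01 << 21) | (0x12 << 8) | 0x03
print(decode_three_operand(sample))
```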


Figure 6-3 shows the encoding for three-operand addressing. If the SRC1 
and SRC2 fields use the same auxiliary register, both addresses are correctly 
generated. However, only the value created by the SRC1 field is saved in the 
auxiliary register specified. The assembler issues a warning if this condition 
is specified by the user. 


The following values of ARn and ARm are valid: 


ARn, 0 ≤ n ≤ 7
ARm, 0 ≤ m ≤ 7


The notation ‘modm’ or ‘modn’ indicates the modification field that goes with the
ARm or ARn field, respectively. Refer to Table 6-2 for further information.


In indirect addressing of the three-operand addressing mode, displacements
(if used) of 0 or 1 are allowed, and the index registers (IR0 and IR1) can be
used. The displacement of 1 is implied and is not explicitly coded in the
instruction word.


 31 29 28       23 22 21 20   16 15          8 7           0
| 0 0 1 | operation |  T  |  dst  |    SRC1    |    SRC2     |


Figure 6-3. Encoding for Three-Operand Addressing Modes 
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6.2.3 Parallel Addressing Modes 


Instructions that use parallel addressing (indicated by || (two vertical bars) ) 
allow for the greatest amount of parallelism possible. The destination oper- 
ands are indicated as d1 and d2, signifying dst1 and dst2, respectively (see 
Figure 6-4). The source operands, signified by src1 and src2, use the ex- 
tended-precision registers. The parallel operation to be performed is notated 
as ‘operation’. 


 31 30 29     26 25 24  23   22  21  19 18  16 15       8 7        0
| 1 0 | operation |  P  | d1 | d2 | src1 | src2 |   src3   |   src4   |


Figure 6-4. Encoding for Parallel Addressing Modes 


The parallel addressing mode (P) field specifies how the operands are to be
used, i.e., whether they are source or destination. The specific relationship
between the P field and the operands is detailed in the description of the
individual parallel instructions (see Section 11). However, the operands are
always encoded in the same way. Bits 31 and 30 are set to the value of 10,
indicating parallel addressing mode instructions. Bits 25 and 24 specify the
parallel addressing mode (P) field, which defines how bits 21-0 are to be
interpreted for addressing the src operands. Bits 21-19 are used to define the
src1 address, bits 18-16 to define the src2 address, bits 15-8 the src3 address,
and bits 7-0 the src4 address. The notations ‘modn’ and ‘modm’ indicate
which modification field goes with which ARn or ARm (auxiliary register)
field, respectively. The parallel addressing operands are listed below.


src1    0 ≤ src1 ≤ 7 (extended-precision registers R0-R7)
src2    0 ≤ src2 ≤ 7 (extended-precision registers R0-R7)
d1      If 0, dst1 is R0. If 1, dst1 is R1.
d2      If 0, dst2 is R2. If 1, dst2 is R3.
P       0 ≤ P ≤ 3
src3    indirect (disp = 0, 1, IR0, IR1)
src4    indirect (disp = 0, 1, IR0, IR1)


As in the three-operand addressing mode, indirect addressing in the parallel
addressing mode allows for displacements of 0 or 1 and the use of the index
registers (IR0 and IR1). The displacement of 1 is implied and is not explicitly
coded in the instruction word.


In the encoding shown for this mode in Figure 6-4, if the src3 and src4 fields 
use the same auxiliary register, both addresses are correctly generated, but 
only the value created by the src3 field is saved in the auxiliary register speci- 
fied. The assembler issues a warning if this condition is specified by the user. 
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6.2.4 Long-Immediate Addressing Mode 


The long-immediate addressing mode is used to encode the program control
instructions (BR, BRD, and CALL), for which it is useful to have a 24-bit
absolute address contained in the instruction word. The unconditional branches,
BR (standard) and BRD (delayed), use the long-immediate addressing mode.
Bits 31-25 are set to the value of 0110000, indicating long-immediate
addressing mode instructions. Selection of bit 24 determines the type of branch:
D = 0 for a standard branch or D = 1 for a delayed branch. The long-immediate
operand is the 24-bit src. These instructions are encoded as shown in
Figure 6-5.


 31           25 24 23                    0
| 0 1 1 0 0 0 0 | D |         src           |


Figure 6-5. Encoding for Long-Immediate Addressing Mode 
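

The field layout for the unconditional branches can be assembled with shifts and ORs. The
Python sketch below covers BR and BRD only, using the bit positions given in this section;
the function name is hypothetical.

```python
def encode_long_branch(address, delayed=False):
    """Build a BR (D = 0) or BRD (D = 1) instruction word.

    Bits 31-25 hold 0110000, bit 24 holds D, and bits 23-0 hold the 24-bit src.
    """
    if not 0 <= address < (1 << 24):
        raise ValueError("src must fit in 24 bits")
    return (0b0110000 << 25) | (int(delayed) << 24) | address


# BR 8000h (the branch used in Example 6-21) and its delayed form.
print(f"{encode_long_branch(0x8000):08X}h")
print(f"{encode_long_branch(0x8000, delayed=True):08X}h")
```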


6.2.5 Conditional-Branch Addressing Modes 


Instructions using the conditional-branch addressing modes (Bcond,
BcondD, CALLcond, DBcond, and DBcondD) can perform a variety of
conditional operations. Bits 31-27 are set to the value of 01101, indicating
conditional-branch addressing mode instructions. Bit 26 is set to 0 or 1; the
former selects DBcond, the latter Bcond. Selection of bit 25 determines the
conditional-branch addressing mode (B). If B = 0, register addressing is
used; if B = 1, PC-relative addressing is used. Selection of bit 21 sets the type
of branch: D = 0 for a standard branch or D = 1 for a delayed branch. The
condition field (cond) specifies the condition checked to determine what
action to take, i.e., whether or not to branch (see Section 11 for a list of
condition codes). Figure 6-6 shows the encoding for conditional-branch
addressing.


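[Encoding diagram not reproduced: it shows the fields described above (bits 31-27 = 01101,
bit 26, B, D, and cond) followed by the src field, which holds a register or an immediate.]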


Figure 6-6. Encoding for Conditional-Branch Addressing Modes


6.3 Circular Addressing


Many algorithms, such as convolution and correlation, require the implementation
of a circular buffer in memory. In convolution and correlation, the circular
buffer is used to implement a sliding window which contains the most
recent data to be processed. As new data is brought in, the new data
overwrites the oldest data. Key to the implementation of a circular buffer is the
implementation of a circular addressing mode. This section describes the circular
addressing mode of the TMS320C30.


The blocksize register (BK) specifies the size of the circular buffer. Information
concerning the lower 16 bits of the BK register plus a user-selected auxiliary
register (ARn) is used to specify the bottom of the circular buffer. The
information concerning the BK register is the location of the first 1 bit,
counting from the most-significant bit to the least-significant bit, in the lower
16 bits. With the location of the first 1 bit specified as bit N, the address at
the top of the buffer is referred to as the effective base (EB) and is equal to
bits 31 through (N+1) of ARn with bits N through 0 of EB being zero.


Figure 6-7 illustrates the relationships between the blocksize register (BK), the
auxiliary registers (ARn), the bottom of the circular buffer, the top of the
circular buffer, and the index into the circular buffer.


[Flowchart not reproduced: it relates BK (first 1 at bit location N), ARn, the effective
base, the bottom of the buffer + 1, and the circular addressing algorithm logic.]


LEGEND
ARn = auxiliary register n       L   = low-order bits
BK  = blocksize register         L'  = new low-order bits
EB  = effective base             LSB = least-significant bit
H   = high-order bits            N   = bit value


Figure 6-7. Flowchart for Circular Addressing 


In circular addressing, ‘index’ refers to the N LSBs of the auxiliary register
selected, and ‘step’ is the quantity being added to or subtracted from the
auxiliary register. The following two rules must be followed when using circular
addressing:


•  The step used must be less than or equal to the blocksize.


•  The first time the circular queue is addressed, the auxiliary register must
   be pointing to an element in the circular queue.
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The algorithm for circular addressing is as follows: 


If 0 ≤ index + step < BK:

index = index + step.
Else if index + step ≥ BK:

index = index + step - BK.
Else if index + step < 0:

index = index + step + BK. 
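

The three cases above can be exercised with a short Python sketch. It is a model of the
stated algorithm, not of the hardware: the step is passed as a signed integer (negative for
the subtract forms), and effective_base reflects one reading of the EB definition given
earlier in this section (bits N through 0 of ARn cleared, where N is the position of the
first 1 bit in the lower 16 bits of BK). Both function names are hypothetical.

```python
def circ_index(index, step, bk):
    """Apply one circular update to index for a buffer of bk elements."""
    if abs(step) > bk:
        raise ValueError("step must be less than or equal to the blocksize")
    index += step
    if index >= bk:
        index -= bk   # wrapped past the last element
    elif index < 0:
        index += bk   # wrapped below the first element
    return index


def effective_base(ar, bk):
    """Clear bits N..0 of ar, where N is the first 1 bit in the low 16 bits of bk."""
    n = (bk & 0xFFFF).bit_length() - 1
    return ar & ~((1 << (n + 1)) - 1)


# Step sequence of Figure 6-9: AR0 = 0, BK = 6.
index = 0
for step in (5, 2, -3, 6, -1):
    index = circ_index(index, step, 6)
    print(index)
```

Run against the step sequence of Figure 6-9, the loop visits indices 5, 1, 4, 4, and 3, which
matches the AR0 values listed there.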


Figure 6-8 shows how the circular buffer is implemented. It illustrates the re- 
lationship of the quantities generated and the elements in the circular buffer. 


Address                                   Data

Effective Base (EB)    H | 0...0     ->   Top of Circular Buffer
                                          Element 1
                                          ...
Aux. Register (ARn)                  ->   Element (N LSBs of ARn)
                                          ...
                                          Last Element
EB + (LSBs of BK)                    ->   Last Element + 1


Figure 6-8. Circular Buffer Implementation


Figure 6-9 provides an example that shows the operation of circular addressing.
Assuming that all ARs are four bits, let AR0 = 0000, and BK = 0110
(blocksize of 6). This example shows a sequence of modifications and the
resulting value of AR0. It also shows how the pointer steps through the
circular queue, with a variety of step sizes (both incrementing and
decrementing).


*AR0           ; AR0 = 0 (0th value)
*AR0++(5)%     ; AR0 = 5 (1st value)
*AR0++(2)%     ; AR0 = 1 (2nd value)
*AR0--(3)%     ; AR0 = 4 (3rd value)
*AR0++(6)%     ; AR0 = 4 (4th value)
*AR0--%        ; AR0 = 3 (5th value)


Value          Data                        Address
0th ->         Element 0                   0
2nd ->         Element 1                   1
               Element 2                   2
5th ->         Element 3                   3
3rd, 4th ->    Element 4                   4
1st ->         Element 5 (Last Element)    5
               Last Element + 1            6


Figure 6-9. Circular Addressing Example 


Circular addressing is especially useful for the implementation of FIR filters.
Figure 6-10 shows one possible data structure for FIR filters. Note that the
initial value of AR0 points to h(N-1), and the initial value of AR1 points to
x(0). Circular addressing is used in the TMS320C30 code for the FIR filter
shown in Figure 6-11. 
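

For readers who want to trace the data flow away from the assembler listing, the following
Python sketch models the same idea: each new sample overwrites the oldest one in a circular
buffer, and the coefficients are walked from h(N-1) down to h(0). It is an illustrative
model under simple assumptions (Python floats, list indices standing in for AR0 and AR1),
and make_fir is a hypothetical name, not part of any TI tool.

```python
def make_fir(h):
    """Return a function that filters one sample at a time with impulse response h."""
    n = len(h)
    coeffs = list(reversed(h))   # stored as h(N-1) ... h(0)
    samples = [0.0] * n          # circular input-sample buffer
    pos = 0                      # plays the role of AR1

    def step(x):
        nonlocal pos
        samples[pos] = x         # newest sample overwrites the oldest
        pos = (pos + 1) % n      # circular increment: now points at the oldest
        acc = 0.0
        for k in range(n):       # N multiply-accumulates, oldest sample first
            acc += coeffs[k] * samples[(pos + k) % n]
        return acc

    return step


fir = make_fir([1.0, 2.0, 3.0])
print([fir(x) for x in (1.0, 0.0, 0.0, 0.0)])
```

Feeding a unit impulse returns the impulse response itself, which is a convenient sanity
check on the pointer arithmetic.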


Impulse Response        Input Samples


Figure 6-10. Data Structure for FIR Filters


* Initialization
*
        LDI    N,BK                  ; Load block size.
        LDI    H,AR0                 ; Load pointer to impulse response.
        LDI    X,AR1                 ; Load pointer to bottom of input
                                     ; sample buffer.
*
TOP     LDF    IN,R3                 ; Read input sample.
        STF    R3,*AR1++%            ; Store with other samples,
                                     ; and point to top of buffer.
        LDF    0,R0                  ; Initialize R0.
        LDF    0,R2                  ; Initialize R2.
*
* Filter
*
        RPTS   N-1                   ; Repeat next instruction.
        MPYF3  *AR0++%,*AR1++%,R0
||      ADDF3  R0,R2,R2              ; Multiply and accumulate.
        ADDF   R0,R2                 ; Last product accumulated.
*
        STF    R2,Y                  ; Save result.
        B      TOP                   ; Repeat.


Figure 6-11. FIR Filter Code Using Circular Addressing
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6.4 Bit-Reversed Addressing 


Bit-reversed addressing on the TMS320C30 is useful in FFT algorithms using
a variety of radices. One auxiliary register is used as a pointer to the physical
location of a data value. IR0 is used to specify the size of the FFT; e.g., the
value contained in IR0 must be equal to 2^n, where n is an integer. By adding
IR0 to the auxiliary register using bit-reversed addressing, addresses are
generated in a bit-reversed fashion.


To illustrate this kind of addressing, assume eight-bit auxiliary registers. Let
AR2 contain the value 0110 0000 (96). This is the base address of the data
in memory. Let IR0 contain the value 0000 1000 (8). Figure 6-12 shows a
sequence of modifications of AR2 and the resulting values of AR2.


*AR2           ; AR2 = 0110 0000 (0th value)
*AR2++(IR0)B   ; AR2 = 0110 1000 (1st value)
*AR2++(IR0)B   ; AR2 = 0110 0100 (2nd value)
*AR2++(IR0)B   ; AR2 = 0110 1100 (3rd value)
*AR2++(IR0)B   ; AR2 = 0110 0010 (4th value)
*AR2++(IR0)B   ; AR2 = 0110 1010 (5th value)
*AR2++(IR0)B   ; AR2 = 0110 0110 (6th value)
*AR2++(IR0)B   ; AR2 = 0110 1110 (7th value)


Figure 6-12. Bit-Reversed Addressing Example 
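

Reverse-carry addition can be modeled by reversing the bits of both operands, adding
normally, and reversing the result. The Python sketch below takes that approach, assuming
eight-bit registers as in Figure 6-12; it is an equivalent software model rather than a
description of the hardware adder, and the function names are hypothetical.

```python
def reverse_bits(value, width):
    """Reverse the lowest `width` bits of value."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def bit_reversed_add(ar, ir0, width=8):
    """Add ir0 to ar with the carry propagating from MSB toward LSB."""
    mask = (1 << width) - 1
    total = (reverse_bits(ar, width) + reverse_bits(ir0, width)) & mask
    return reverse_bits(total, width)


# Values from Figure 6-12: AR2 = 0110 0000, IR0 = 0000 1000.
ar2 = 0b01100000
for _ in range(7):
    ar2 = bit_reversed_add(ar2, 0b00001000)
    print(f"{ar2:08b}")
```

Starting from AR2 = 0110 0000 with IR0 = 0000 1000, the loop reproduces the seven
successive AR2 values of Figure 6-12.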


Table 6-3 shows the relationship of the index steps and the four LSBs of AR2. 
It can be seen that the four LSBs can be found by reversing the bit pattern of 
the steps. 


Table 6-3. Index Steps and Bit-Reversed Addressing 


STEP   BIT PATTERN   BIT-REVERSED PATTERN   BIT-REVERSED STEP
0      0000          0000                   0
1      0001          1000                   8
2      0010          0100                   4
3      0011          1100                   12
4      0100          0010                   2
5      0101          1010                   10
6      0110          0110                   6
7      0111          1110                   14


6.5 System and User Stack Management


The TMS320C30 provides a dedicated system stack pointer (SP) for building 
stacks in memory. The auxiliary registers can also be used to build a variety 
of more general linear lists. This section discusses the implementation of the 
following types of linear lists: 


Stack    A linear list for which all insertions and deletions are made at one
         end of the list.


Queue    A linear list for which all insertions are made at one end of the list,
         and all deletions are made at the other end.


Deque    A ‘double-ended queue’ linear list for which insertions and deletions
         are made at either end of the list.


The system stack pointer (SP) is a 32-bit register that contains the address of 
the top of the system stack. The system stack fills from low-memory address 
to high-memory address (see Figure 6-13). The SP always points to the last 
element pushed onto the stack. A push performs a preincrement; and a pop, 
a postdecrement of the system stack pointer. 


The program counter is pushed on the system stack on subroutine calls, traps, 
and interrupts. It is popped from the system stack on returns. The system stack 
can be pushed and popped using the PUSH, POP, PUSHF, and POPF in- 
structions. 
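

The push and pop rules can be modeled in a few lines of Python. This sketch assumes a
small list standing in for memory and an arbitrary initial SP; the class name is
hypothetical, and no overflow or underflow checking is attempted.

```python
class SystemStack:
    """Stack that fills from low to high memory, with SP on the last element pushed."""

    def __init__(self, memory_size, sp):
        self.mem = [0] * memory_size
        self.sp = sp

    def push(self, value):
        self.sp += 1               # preincrement
        self.mem[self.sp] = value

    def pop(self):
        value = self.mem[self.sp]
        self.sp -= 1               # postdecrement
        return value


stack = SystemStack(memory_size=16, sp=4)
stack.push(10)
stack.push(20)
print(stack.sp, stack.pop(), stack.pop(), stack.sp)
```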


LOW MEMORY 


BOTTOM OF STACK 


SP -> TOP OF STACK
(FREE) 


HIGH MEMORY 


Figure 6-13. System Stack Configuration 


6.5.1 Stacks 
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Stacks can be built from low to high memory or high to low memory. Two 
cases for each type of stack are shown. Stacks can be built using the 
preincrement/decrement and postincrement/decrement modes of modifying 
the auxiliary registers (AR). Stack growth from high-to-low memory can be 
implemented in two ways: 


CASE 1: Stores to memory using *--ARn to push data on the stack and reads
from memory using *ARn++ to pop data off the stack.


CASE 2: Stores to memory using *ARn-- to push data on the stack and reads
from memory using *++ARn to pop data off the stack.
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Figure 6-14 illustrates these two cases. The only difference is that in using 
case 1, the AR always points to the top of the stack, and in case 2, the AR 
always points to the next free location on the stack. 
CASE 1                          CASE 2
LOW MEMORY                      LOW MEMORY

        (FREE)                  ARn ->  (FREE)
ARn ->  TOP OF STACK                    TOP OF STACK

        BOTTOM OF STACK                 BOTTOM OF STACK

HIGH MEMORY                     HIGH MEMORY


Figure 6-14. Implementations of High-to-Low Memory Stacks


Stack growth from low-to-high memory can be implemented in two ways:


CASE 3: Stores to memory using *++ARn to push data on the stack and
reads from memory using *ARn-- to pop data off the stack.


CASE 4: Stores to memory using *ARn++ to push data on the stack and
reads from memory using *--ARn to pop data off the stack.


Figure 6-15 shows these two cases. In case 3, the AR always points to
the top of the stack. In case 4, the AR always points to the next free location 
on the stack. 
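

All four cases follow one pattern: the growth direction and whether the push modifies ARn
before or after the store. The Python sketch below captures that pattern; the class name,
its parameters, and the starting address are hypothetical, and a dictionary stands in for
memory.

```python
class UserStack:
    """User stack built on one auxiliary register.

    grow     : -1 for high-to-low growth (cases 1 and 2), +1 for low-to-high (3 and 4)
    pre_push : True if the push premodifies ARn (cases 1 and 3, AR points to the top);
               False if it postmodifies (cases 2 and 4, AR points to the next free slot)
    """

    def __init__(self, ar, grow, pre_push):
        self.mem = {}
        self.ar = ar
        self.grow = grow
        self.pre_push = pre_push

    def push(self, value):
        if self.pre_push:
            self.ar += self.grow       # e.g. *--ARn or *++ARn
            self.mem[self.ar] = value
        else:
            self.mem[self.ar] = value  # e.g. *ARn-- or *ARn++
            self.ar += self.grow

    def pop(self):
        if self.pre_push:
            value = self.mem[self.ar]  # e.g. *ARn++ or *ARn--
            self.ar -= self.grow
        else:
            self.ar -= self.grow       # e.g. *++ARn or *--ARn
            value = self.mem[self.ar]
        return value


case1 = UserStack(ar=100, grow=-1, pre_push=True)
case1.push(1)
case1.push(2)
print(case1.ar, case1.pop(), case1.pop(), case1.ar)
```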


CASE 3                          CASE 4
LOW MEMORY                      LOW MEMORY

        BOTTOM OF STACK                 BOTTOM OF STACK

ARn ->  TOP OF STACK                    TOP OF STACK
        (FREE)                  ARn ->  (FREE)

HIGH MEMORY                     HIGH MEMORY


Figure 6-15. Implementations of Low-to-High Memory Stacks


6.5.2 Queues and Deques


The implementation of queues and deques is based upon the manipulation
of the auxiliary registers for user stacks. For queues, two auxiliary registers
are used, one to mark the front of the queue from which data is popped and
the other to mark the rear of the queue where data is pushed.


For deques, two auxiliary registers are also necessary. One is used to mark 
one end of the deque, and the other is used to mark the other end. Data can 
be popped or pushed from either end. 
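

A queue built this way can be sketched in Python with two integers standing in for the two
auxiliary registers. The choice of postincrement for both pointers is an assumption made
for this illustration, as are the class and attribute names; the manual does not prescribe
a particular combination.

```python
class TwoPointerQueue:
    """Queue using one pointer for the front and one for the rear."""

    def __init__(self, base):
        self.mem = {}
        self.front = base   # next element to be popped
        self.rear = base    # next free location for a push

    def push(self, value):
        self.mem[self.rear] = value   # store, then postincrement the rear pointer
        self.rear += 1

    def pop(self):
        if self.front == self.rear:
            raise IndexError("queue is empty")
        value = self.mem[self.front]  # read, then postincrement the front pointer
        self.front += 1
        return value


q = TwoPointerQueue(base=0x200)
for sample in (3, 5, 8):
    q.push(sample)
print(q.pop(), q.pop(), q.pop())
```

A deque would add the mirror-image operations (premodified accesses at each end), so that
either pointer can both push and pop.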
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Program Flow Control 


The TMS320C30 provides a complete set of flexible and powerful constructs 
that allow for software control of the program flow. These consist of two main 
types: repeat modes and branching (standard and delayed). When program- 
ming includes a combination of repeat modes, standard branches, and delayed 
branches, the type best suited for a particular application can be selected. 


Several interlocked operations instructions provide a flexible means of multi- 
processor support. Through the use of external signals, these instructions al- 
low for powerful synchronization mechanisms. They also guarantee the 
integrity of the communication and result in a high-speed operation. 


The TMS320C30 supports a nonmaskable external reset signal and a number 
of internal and external interrupts. These functions can be programmed for a 
particular application. 


Major topics discussed in this section include: 


•  Repeat Modes (Section 7.1)
   -  Initialization
   -  Operation
•  Delayed Branches (Section 7.2)
•  Interlocked Operations (Section 7.3)
•  Reset Operation (Section 7.4)
•  Interrupts (Section 7.5)
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7.1 Repeat Modes 


The repeat modes of the TMS320C30 allow for the implementation of zero- 
overhead looping. For many algorithms, there is an inner kernel of code where 
most of the execution time is spent. Using the repeat modes allows these 
time-critical sections of code to be executed in the shortest possible time. 


The TMS320C30 provides two instructions to support zero-overhead looping:
RPTB (repeat a block of code) and RPTS (repeat a single instruction). RPTB
allows for a block of code to be repeated a specified number of times. RPTS
allows a single instruction to be repeated a number of times and reduces the
bus traffic by fetching the instruction only once.


Three registers (RS, RE, and RC) control the updating of the program counter
in a repeat mode. Table 7-1 describes these registers.


Table 7-1. Repeat Mode Registers 


REGISTER   FUNCTION
RS         Repeat Start Address Register. Holds the address of the first instruction
           of the block of code to be repeated.
RE         Repeat End Address Register. Holds the address of the last instruction
           of the block of code to be repeated.
RC         Repeat Count Register. Contains one less than the number of times the
           block remains to be repeated.


7.1.1 Repeat Mode Initialization 
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There are two bits that are very important to the operation of RPTB and RPTS, 
the RM and S bits. 


The RM (repeat mode flag) bit in the status register specifies if the processor
is running in the repeat mode. If RM = 0, fetches are not made in repeat
mode; if RM = 1, fetches are made in repeat mode.


The S bit is hidden from the user, but is necessary to fully describe the oper- 
ation of RPTB and RPTS. If S = 0, the CPU is not performing fetches in the 
repeat-single mode. If S = 1 and RM = 1, the CPU is performing fetches in 
the repeat-single mode. 


The correct operation of the repeat modes requires that all of the above regis- 
ters and status register fields be initialized correctly. The RPTB and RPTS in- 
structions perform this initialization in slightly different ways (see Sections 
7.1.2 and 7.1.3). 
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7.1.2 RPTB Initialization 
When RPTB src is executed, the following operations take place:


1) PC + 1 -> RS
2) src -> RE
3) 1 -> RM status register bit
4) 0 -> S bit


Step 1 loads the start address of the block into RS. Step 2 loads the src into
the RE (end address of the block). The src operand is a 24-bit value contained
in the instruction word. Step 3 sets the status register to indicate the
repeat mode of operation. Step 4 indicates that this is the repeat block mode
of operation.


The last bit of information required is the number of times to repeat the block. 
The value is determined by properly initializing the RC (repeat count) register. 
Since the execution of RPTB does not load the RC, this register must be 
loaded explicitly by the user. The typical setup of the block repeat operation 
is shown below. 


LDI  15,RC      ; 15 -> RC
RPTB LOOP       ; LOOP -> RE, PC + 1 -> RS, 1 -> RM, 0 -> S


The repeat modes repeat a block of code at least once in a typical operation. 
The repeat counter should be loaded with one less than the number of times 
to repeat the block; i.e., a value of 0 in RC repeats the block of code one time. 
All block repeats initiated by RPTB can be interrupted. 


7.1.3 RPTS Initialization 
When RPTS src is executed, the following sequence of operations occurs: 


1) PC + 1 -> RS
2) PC + 1 -> RE
3) 1 -> RM status register bit
4) 1 -> S bit
5) src -> RC


The RPTS instruction loads all registers and mode bits necessary for the
operation of the single-instruction repeat mode. Step 1 loads the start address of
the block into RS. Step 2 loads the end address into the RE (end address of
the block). Since this is a repeat of a single instruction, the start address and
the end address are the same. Step 3 sets the status register to indicate the
repeat mode of operation. Step 4 indicates that this is the repeat
single-instruction mode of operation. Step 5 loads the operand src into RC.


Repeats of a single instruction initiated by RPTS are not interruptible, since
the RPTS fetches the instruction word only once and then keeps it in the
instruction register for reuse. An interrupt would cause the instruction word to
be lost. The refetching of the instruction word from the instruction register
reduces memory accesses and, in effect, acts as a one-word program cache.
If it is necessary to have a single instruction that is repeatable and interruptible,
the RPTB instruction may be used on this single instruction.
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7.1.4 Repeat Mode Operation 


The information in the repeat mode registers and associated control bits is 
used to control the modification of the PC when the fetches are being made 
in repeat mode. The repeat modes compare the contents of the RE register 
with the program counter (PC). If they match and the repeat counter is non- 
negative, the repeat counter is decremented, the PC is loaded with the repeat 
start address, and the processing continues. The fetches and appropriate sta- 
tus bits are modified as necessary. Note that the repeat counter (RC) is never 
modified when RM is 0. The maximum number of repeats occurs when RC =
080000000h. This will result in 080000001h repetitions. The detailed algorithm
for the update of the PC is described in Figure 7-1.


if RM == 1                              ; If in repeat mode (RPTB or RPTS)
   if S == 1                            ; If RPTS
      if first time through             ; If this is the first fetch
         fetch instruction from memory  ; Fetch instruction from memory
      else                              ; If not the first fetch
         fetch instruction from IR      ; Fetch instruction from IR
      RC - 1 -> RC                      ; Decrement RC
      if RC < 0                         ; If RC is negative
                                        ; Repeat single mode completed
         0 -> ST(RM)                    ; Turn off repeat mode bit
         0 -> S                         ; Clear S
         PC + 1 -> PC                   ; Increment PC
   else if S == 0                       ; If RPTB
      fetch instruction from memory     ; Fetch instruction from memory
      if PC == RE                       ; If this is the end of the block
         RC - 1 -> RC                   ; Decrement RC
         if RC >= 0                     ; If RC is not negative
            RS -> PC                    ; Set PC to start of block
         else if RC < 0                 ; If RC is negative
            0 -> ST(RM)                 ; Turn off repeat mode bit
            0 -> S                      ; Clear S
            PC + 1 -> PC                ; Increment PC


Figure 7-1. Repeat Mode Control Algorithm 
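
The repeat mode control algorithm can be exercised with a small software model. The following Python sketch is illustrative only: the function names run_rptb and run_rpts are hypothetical, the model assumes that the PC simply increments when it does not match RE, and it ignores pipeline timing. It shows why loading RC with n produces n + 1 passes.

```python
def run_rptb(rs, re, rc, max_fetches=100_000):
    """Model the RPTB branch of the repeat mode algorithm.

    Returns (list of fetched addresses, final PC, final RC).
    """
    pc, rm = rs, 1
    fetched = []
    while rm and len(fetched) < max_fetches:
        fetched.append(pc)      # fetch instruction from memory
        if pc == re:            # end of the block reached
            rc -= 1             # RC - 1 -> RC
            if rc >= 0:
                pc = rs         # RS -> PC
            else:
                rm = 0          # 0 -> ST(RM)
                pc += 1         # PC + 1 -> PC
        else:
            pc += 1             # assumed: ordinary sequential fetch
    return fetched, pc, rc


def run_rpts(rc):
    """Count how many times RPTS issues the repeated instruction."""
    rm, issued = 1, 0
    while rm:
        issued += 1             # first from memory, afterwards from IR
        rc -= 1                 # RC - 1 -> RC
        if rc < 0:
            rm = 0              # repeat single mode completed
    return issued
```

With RC loaded with 15, as in Example 7-1, both models report sixteen passes.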


The RPTB and RPTS are four-cycle instructions. These four cycles of over- 
head are only incurred on the first pass through the loop. All subsequent 
passes through the loop are accomplished with zero cycles of overhead. In 
Example 7-1, the block of code from STLOOP to ENDLOP is repeated sixteen 
times. 


Example 7-1. RPTB Operation 


        LDI   15,RC       ; Load repeat counter with 15
        RPTB  ENDLOP      ; Execute the block of code

STLOOP                    ; from STLOOP to ENDLOP 16 times

ENDLOP

Using this mode of modifying the PC allows for a straightforward analysis of
what happens in the case of branches within the block. It is best to look at
the operation from the point of view that the next value of the PC will be
either PC + 1 or the contents of the RS register. It is thus apparent that
this method of block repeat allows for any amount of branching within the
repeated block. Execution can go anywhere within the user's code via
interrupts, subroutine calls, etc. For proper modification of the loop
counter, the last instruction of the loop must be fetched. The repeating of
the loop can be stopped prior to completion by writing a 0 into the repeat
counter or writing 0 into the RM bit of the status register.


Since the block repeat modes modify the program counter, other instructions 
cannot modify the program counter at the same time. Two rules apply here: 


1) The last instruction in the block (or the only instruction in a block of
   size one) cannot be a Bcond, BR, DBcond, CALL, CALLcond, TRAPcond,
   RETIcond, RETScond, IDLE, RPTB, or RPTS. Example 7-2 shows an
   incorrectly placed standard branch.


2) None of the last four instructions from the bottom of the block (or the
   only instruction in a block of size one) can be a BcondD, BRD, or
   DBcondD. Example 7-3 shows an incorrectly placed delayed branch.


If either of these rules is violated, the PC will be undefined.
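
The two placement rules lend themselves to a mechanical check. The Python sketch below is a hypothetical lint helper, not part of any TI tool: check_repeat_block and the CONDS list are names introduced here, and CONDS covers only an assumed subset of the condition codes. It takes the mnemonics of a repeat block in program order and returns the numbers of the rules that are violated.

```python
# Assumed subset of condition suffixes; the device defines more.
CONDS = ["", "U", "Z", "NZ", "LT", "GT", "LE", "GE"]

# Rule 1: instructions that may not be the last instruction of the block.
STANDARD = {"BR", "CALL", "IDLE", "RPTB", "RPTS"}
for c in CONDS:
    STANDARD |= {"B" + c, "DB" + c, "CALL" + c,
                 "TRAP" + c, "RETI" + c, "RETS" + c}

# Rule 2: delayed branches that may not be among the last four instructions.
DELAYED = {"BRD"}
for c in CONDS:
    DELAYED |= {"B" + c + "D", "DB" + c + "D"}


def check_repeat_block(mnemonics):
    """Return the sorted list of violated rule numbers for a repeat block."""
    violated = set()
    if mnemonics and mnemonics[-1] in STANDARD:
        violated.add(1)
    if any(m in DELAYED for m in mnemonics[-4:]):
        violated.add(2)
    return sorted(violated)
```

For instance, check_repeat_block(["ADDF", "BR"]) reports rule 1, and check_repeat_block(["BRD", "ADDF", "MPYF", "SUBF"]) reports rule 2.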


Example 7-2. Incorrectly Placed Standard Branch

        LDI   15,RC       ; Load repeat counter with 15
        RPTB  ENDLOP      ; Execute block of code
STLOOP                    ; from STLOOP to ENDLOP 16 times
        . . .
ENDLOP  BR    OOPS        ; This branch violates rule 1


Example 7-3. Incorrectly Placed Delayed Branch


        LDI   15,RC       ; Load repeat counter with 15
        RPTB  ENDLOP      ; Execute block of code
STLOOP                    ; from STLOOP to ENDLOP 16 times
        . . .
        BRD   OOPS        ; This branch violates rule 2
        ADDF
        MPYF
ENDLOP  SUBF


Block repeats (RPTB) are nestable. Since all of the control is defined by the 
RS, RE, RC, and ST registers, saving and restoring these registers allows for 
their nesting. The RM bit in the status register can be used to determine if the 
block repeat mode is active. For example, if an interrupt service routine is 
written which requires the use of RPTB, it is possible that the interrupt asso- 
ciated with the routine may occur during another block repeat. The interrupt 
service routine can check the RM bit. If this bit is set, the interrupt routine 
saves RS, RE, RC, and ST. The interrupt routine can then perform a block re- 
peat. Before returning to the interrupted routine, the interrupt routine restores 
RS, RE, RC, and ST. If the RM bit is not set, the save and restore of these
registers is not necessary.

7.2 Delayed Branches


The branching capabilities of the TMS320C30 include two main types: standard
and delayed branches. Standard branches empty the pipeline before performing
the branch to guarantee correct management of the program counter. This
results in a TMS320C30 branch taking four cycles. Included in this class are
calls, returns, and traps.


Delayed branches on the TMS320C30 do not empty the pipeline, but rather
guarantee that the next three instructions will be executed before the program
counter is modified by the branch. The result is a branch that only requires a
single cycle, thus making the speed of the delayed branch very close to the
optimal block repeat modes of the TMS320C30. However, unlike block repeat
modes, delayed branches may be used in situations other than looping. Every
delayed branch has a standard branch counterpart that is used when a delayed
branch cannot be used. The delayed branches of the TMS320C30 are BcondD, BRD,
and DBcondD.


Conditional delayed branches use the conditions that exist at the end of the
instruction immediately preceding the delayed branch. They do not depend
upon the instructions following the delayed branch. Delayed branches are
guaranteed to allow the three following instructions to be executed regardless
of other pipeline conflicts.


When a delayed branch is fetched, it remains pending until the three following
instructions are executed. None of the three instructions that follow a delayed
branch can be Bcond, BcondD, BR, BRD, DBcond, DBcondD, CALL, CALLcond,
TRAPcond, RETIcond, RETScond, RPTB, RPTS, or IDLE (see Example 7-4).


Delayed branches disable interrupts until the three instructions following the
delayed branch are completed. This is independent of whether or not the
branch is taken.


If delayed branches are used incorrectly, the PC will be undefined.
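
The difference in execution order between a standard and a delayed branch can be illustrated with a toy interpreter. The Python sketch below is a simplified model under stated assumptions: the function trace is hypothetical, only the unconditional forms B and BD are modeled, every instruction is treated as one step, and a branch found in one of the three slots after a delayed branch raises an error to stand in for the undefined PC.

```python
BRANCHES = {"B", "BD"}  # only the unconditional forms are modeled here


def trace(program, start=0, limit=100):
    """Return the order in which instruction indices are executed.

    program is a list of (mnemonic, target_index) pairs; target_index is
    ignored for instructions that are not branches.
    """
    pc, order, pending = start, [], None
    while 0 <= pc < len(program) and len(order) < limit:
        op, target = program[pc]
        order.append(pc)
        next_pc = pc + 1
        if pending is not None:
            if op in BRANCHES:
                raise ValueError("branch follows a delayed branch: PC undefined")
            slots_left, delayed_target = pending
            if slots_left == 1:
                # Third following instruction done: the branch takes effect.
                next_pc, pending = delayed_target, None
            else:
                pending = (slots_left - 1, delayed_target)
        elif op == "BD":
            pending = (3, target)   # remains pending for three instructions
        elif op == "B":
            next_pc = target        # standard branch: redirect immediately
        pc = next_pc
    return order
```

For a program that starts with BD to index 5 followed by NOPs, the model executes indices 0, 1, 2, 3 and then 5, so the three instructions after the delayed branch always run.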


Example 7-4. Incorrectly Placed Delayed Branches


B1:     BD    L1
        NOP
        NOP
B2:     B     L2          ; This branch is incorrectly placed
        NOP
        NOP
        NOP

7.3 Interlocked Operations


One of the most common multiprocessing configurations is the sharing of
global memory by multiple processors. In order to allow multiple processors 
to access this global memory and share data in a coherent manner, some sort 
of arbitration or handshaking is necessary. This requirement for arbitration is 
the purpose of the TMS320C30 interlocked operations. 


The TMS320C30 provides a flexible means of multiprocessor support with five 
instructions, referred to as interlocked operations. Through the use of external 
signals, these instructions allow for powerful synchronization mechanisms. 
They also guarantee the integrity of the communication and result in a high- 
speed operation. The interlocked-operation instruction group is listed in Table 
7-2. 


Table 7-2. Interlocked Operations


MNEMONIC   DESCRIPTION                                  OPERATION
LDFI       Load floating-point value into a register,   Signal interlocked
           interlocked                                  src -> dst
LDII       Load integer into a register, interlocked    Signal interlocked
                                                        src -> dst
SIGI       Signal, interlocked                          Signal interlocked
                                                        Clear interlock
STFI       Store floating-point value to memory,        src -> dst
           interlocked                                  Clear interlock
STII       Store integer to memory, interlocked         src -> dst
                                                        Clear interlock


The interlocked operations use the two external flag pins, XF0 and XF1. XF0
must be configured as an output pin, and XF1 as an input pin. When configured
in this manner, XF0 signals an interlocked operation request, and XF1 acts
as an acknowledge signal for the requested interlocked operation. In this
mode, XF0 and XF1 are treated as active-low signals.


The external timing for the interlocked loads and stores is the same as for
standard loads and stores. The interlocked loads and stores may be extended
like standard accesses, by using the appropriate ready signal (RDY or XRDY).


The LDFI and LDII instructions perform the following actions:


1) Simultaneously set XF0 to 0 and begin a read cycle. The timing of XF0
   is similar to that of the address bus during a read cycle.

2) Execute an LDF or LDI instruction and extend the read cycle until XF1
   is set to 0 and a ready (RDY or XRDY) is signalled.

3) Leave XF0 set to 0 and end the read cycle.


The read/write operation is identical to any other read/write cycle except for
the special use of XF0 and XF1. The src operand for LDFI and LDII is always
a direct or indirect memory address. XF0 is set to 0 only if the src is located
off-chip (i.e., STRB, MSTRB, or IOSTRB is active), or the src is one of the
on-chip peripherals. If on-chip memory is accessed, then XF0 is not asserted,
and the operation is as an LDF or LDI from internal memory.

The STFI and STII instructions perform the following operations:


1) Simultaneously set XF0 to 1 and begin a write cycle. The timing of XF0
   is similar to that of the address bus during a write cycle.

2) Execute an STF or STI instruction and extend the write cycle until a
   ready (RDY or XRDY) is signalled.


As in the case of LDFI and LDII, the dst of STFI and STII affects XF0 only if
dst is located off-chip (STRB, MSTRB, or IOSTRB is active), or the dst is one
of the on-chip peripherals. If on-chip memory is accessed, then XF0 is not
asserted and the operations are as an STF or STI to internal memory.


The SIGI instruction functions as follows:


1) Sets XF0 to 0.
2) Idles until XF1 is set to 0.
3) Sets XF0 to 1 and ends the operation.


While the LDFI, LDII, and SIGI instructions are waiting for XF1 to be set to
0, they may be interrupted. LDFI and LDII require a ready signal in order to
be interrupted. This allows the user to implement protection mechanisms
against deadlock conditions by interrupting an interlocked load that has taken
too long. Upon return from the interrupt, the next instruction is executed. The
STFI and STII instructions are not interruptible.


Interlocked operations can be used to implement a busy-waiting loop, to
manipulate a multiprocessor counter, to implement a simple semaphore
mechanism, or to perform synchronization between two TMS320C30s. The
following examples illustrate the usefulness of the interlocked operations
instructions. If location LOCK is the interlock for a critical section of
code, and a nonzero value means the lock is busy, the algorithm for a
busy-waiting loop can be used as shown in Example 7-5.


Example 7-5. Busy-Waiting Loop


        LDI   1,R0        ; Put 1 in R0
L1:     LDII  @LOCK,R1    ; Interlocked operation begun
                          ; Contents of LOCK -> R1
        STII  R0,@LOCK    ; Put R0 (= 1) into LOCK, XF0 = 1
                          ; Interlocked operation ended
        BNZ   L1          ; Keep trying until LOCK = 0
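
The busy-waiting loop can be modeled in software to see why the interlocked pair behaves as an atomic test-and-set. The Python sketch below is only an analogy under loud assumptions: a threading.Lock stands in for the external arbitration logic driven by XF0 and XF1, the class SharedMemory and the functions acquire, release, and demo are hypothetical names, and the release step (storing 0 to LOCK) is an assumption that the text does not spell out.

```python
import threading


class SharedMemory:
    """Global memory guarded by a stand-in for the arbitration logic."""

    def __init__(self):
        self.words = {"LOCK": 0}
        self._arbiter = threading.Lock()  # assumption: models XF0/XF1 handshake

    def ldii(self, addr):
        self._arbiter.acquire()           # XF0 = 0: interlocked operation begun
        return self.words[addr]

    def stii(self, value, addr):
        self.words[addr] = value
        self._arbiter.release()           # XF0 = 1: interlocked operation ended


def acquire(mem):
    r0 = 1                                # LDI  1,R0
    while True:
        r1 = mem.ldii("LOCK")             # L1: LDII @LOCK,R1
        mem.stii(r0, "LOCK")              #     STII R0,@LOCK
        if r1 == 0:                       #     BNZ  L1
            return


def release(mem):
    # Assumed release step: store 0 to LOCK using the same interlocked pair.
    mem.ldii("LOCK")
    mem.stii(0, "LOCK")


def demo(workers=3, rounds=50):
    """Run several threads that each update a counter inside the lock."""
    mem, total = SharedMemory(), [0]

    def work():
        for _ in range(rounds):
            acquire(mem)
            total[0] = total[0] + 1       # critical section
            release(mem)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]
```

Because the read of LOCK and the write of 1 happen within one interlocked operation, at most one caller can observe LOCK = 0 and enter the critical section at a time.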


Example 7-6 shows how a location COUNT may contain a count of the 
number of times a particular operation needs to be performed. This operation 
may be performed by any processor in the system. If the count is zero, the 
processor waits until it is nonzero before beginning processing. The algorithm 
for modifying COUNT correctly is shown in Example 7-6. 

Example 7-6. Multiprocessor Counter Manipulation


CT:     OR    4,IOF       ; XF0 = 1
                          ; Interlocked operation ended
        LDII  @COUNT,R1   ; Interlocked operation begun
                          ; Contents of COUNT -> R1
        BZ    CT          ; If COUNT = 0, keep trying
        SUBI  1,R1        ; Decrement R1 (= COUNT)
        STII  R1,@COUNT   ; Update COUNT, XF0 = 1
                          ; Interlocked operation ended


Figure 7-2 illustrates multiple TMS320C30s sharing global memory and using 
the interlocked instructions as in Examples 7-7, 7-8, 7-9, and 7-10. 


[Figure: several TMS320C30s, each with its XF0 and XF1 pins connected to
arbitration logic, sharing a global memory that holds LOCK, COUNT, or S.]


Figure 7-2. Multiple TMS320C30s Sharing Global Memory 


Sometimes it may be necessary for several processors to access some shared 
data or other common resources. The portion of code which must access the 
shared data is called a critical section. 


To ease the programming of critical sections, semaphores may be used.
Semaphores are variables which can only take nonnegative integer values. Two
primitive, indivisible operations are defined on semaphores, namely (with S
being a semaphore):


V(S)       S + 1 -> S

P(S)   P:  if (S == 0), go to P
           else S - 1 -> S


Indivisibility of V(S) and P(S) means that when these processes access and
modify the semaphore S, they are the only processes accessing and modifying
S.


To enter a critical section, a P operation is performed on a common
semaphore, say S (S is initialized to 1). The first processor performing P(S)
will be able to enter its critical section. All other processors are blocked
since S has become 0. After leaving its critical section, the processor
performs a V(S), thus allowing another processor to execute P(S) successfully.
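
The P and V primitives can be sketched in a high-level language to make the indivisibility requirement concrete. The Python model below is an illustration, not device code: the class name InterlockedSemaphore and its helper methods are hypothetical, and a threading.Lock is assumed as a stand-in for the arbitration that the interlocked load and store obtain through XF0 and XF1.

```python
import threading


class InterlockedSemaphore:
    """Counting semaphore whose updates happen inside an interlocked pair."""

    def __init__(self, initial=1):
        self.value = initial
        self._bus = threading.Lock()   # assumed stand-in for arbitration logic

    def _ldii(self):
        self._bus.acquire()            # interlocked read begins (XF0 = 0)
        return self.value

    def _stii(self, new_value):
        self.value = new_value
        self._bus.release()            # interlock ends (XF0 = 1)

    def p(self):
        """P(S): wait until S is nonzero, then decrement it."""
        while True:
            r0 = self._ldii()
            if r0 == 0:
                self._bus.release()    # end the interlock, then try again
                continue
            self._stii(r0 - 1)
            return

    def v(self):
        """V(S): increment S."""
        r0 = self._ldii()
        self._stii(r0 + 1)
```

A processor would call p() before its critical section and v() after it; with S initialized to 1, the first caller of p() leaves S at 0, so later callers spin until v() is performed.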


The TMS320C30 code for V(S) is shown in Example 7-7, and code for P(S) 
is shown in Example 7-8. Compare the code in Example 7-8 to the code in 
Example 7-6. 


Example 7-7. Implementation of V(S)


V:      LDII  @S,R0       ; Interlocked read of S begins (XF0 = 0)
                          ; Contents of S -> R0
        ADDI  1,R0        ; Increment R0 (= S)
        STII  R0,@S       ; Update S, end interlock (XF0 = 1)

Example 7-8. Implementation of P(S)


P:      OR    4,IOF       ; End interlock (XF0 = 1)
        LDII  @S,R0       ; Interlocked read of S begins
                          ; Contents of S -> R0
        BZ    P           ; If S = 0, go to P and try again
        SUBI  1,R0        ; Decrement R0 (= S)
        STII  R0,@S       ; Update S, end interlock (XF0 = 1)


The SIGI operation may be used to synchronize, at an instruction level,
multiple TMS320C30s. Consider two processors connected as shown in Figure
7-3. The code for the two processors is shown in Example 7-9.


TMS320C30 #1                    TMS320C30 #2


Figure 7-3. Zero-Logic Interconnect of TMS320C30s

Processor #1 runs until it executes the SIGI. It then waits until processor #2
executes a SIGI. At this point, the two processors have synchronized and 
continue execution. 


Example 7-9. Code to Synchronize Two TMS320C30s at the Software Level 


Time    Code for TMS320C30 #1                       Code for TMS320C30 #2

  |              .                                           .
  |              .                                           .
  |              .                                           .
  |            SIGI                                          .
  |           (WAIT)                                         .
  |              .                                           .
  |              .    <--- Synchronization Occurs --->     SIGI
  |              .                                           .
  v              .                                           .
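
The rendezvous in Example 7-9 can be stepped through with a small deterministic model. In the Python sketch below, the function synchronize is a hypothetical helper, each instruction is assumed to take one time step, and a processor that reaches SIGI first is shown as (WAIT) until the other processor also reaches its SIGI.

```python
def synchronize(prog1, prog2):
    """Step two instruction lists in lockstep with a SIGI rendezvous.

    Returns a list of (time, op on #1, op on #2) tuples.
    """
    i = j = t = 0
    timeline = []
    while i < len(prog1) or j < len(prog2):
        op1 = prog1[i] if i < len(prog1) else None
        op2 = prog2[j] if j < len(prog2) else None
        if op1 == "SIGI" and op2 == "SIGI":
            timeline.append((t, "SIGI", "SIGI"))   # both proceed together
            i, j = i + 1, j + 1
        elif op1 == "SIGI":
            if op2 is None:
                raise RuntimeError("processor #1 waits forever")
            timeline.append((t, "(WAIT)", op2))    # #1 idles until XF1 = 0
            j += 1
        elif op2 == "SIGI":
            if op1 is None:
                raise RuntimeError("processor #2 waits forever")
            timeline.append((t, op1, "(WAIT)"))    # #2 idles until XF1 = 0
            i += 1
        else:
            timeline.append((t, op1, op2))
            i += op1 is not None
            j += op2 is not None
        t += 1
    return timeline
```

Calling synchronize(["A1", "SIGI", "A2"], ["B1", "B2", "B3", "SIGI", "B4"]) shows processor #1 waiting for two steps before both processors pass their SIGI in the same step.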

7.4 Reset Operation


The TMS320C30 supports a nonmaskable external reset signal (RESET), which 
is used to perform system reset. This section discusses the reset operation. 


At powerup, the state of the TMS320C30 processor is undefined. The RESET 
signal is used to place the processor in a known state. This signal must be 
asserted low for 10 or more H1 clock cycles to guarantee a system reset. H1 
is an output clock signal generated by the TMS320C30 (see Appendix A for 
more information). 


Reset affects the other pins on the device in either a synchronous or
asynchronous manner. The synchronous reset is gated by the TMS320C30's
internal clocks. The asynchronous reset directly affects the pins, and is
faster than the synchronous reset. Table 7-3 shows the state of the
TMS320C30's pins after RESET = 0. Each pin is described according to whether
the pin is reset synchronously or asynchronously.


Table 7-3. Pin Operation at Reset


SIGNAL        # PINS   OPERATION AT RESET
[illegible]            Synchronous reset. Placed in high-impedance state.
[illegible]            Synchronous reset. Placed in high-impedance state.
R/W           1        Synchronous reset. Deasserted by going to a high level.
STRB          1        Synchronous reset. Deasserted by going to a high level.
RDY           1        Reset has no effect.
IOSTRB        1        Synchronous reset. Deasserted by going to a high level.
XRDY          1        Reset has no effect.
IACK          1        Synchronous reset. Deasserted by going to a high level.
MC/MP         1        Reset has no effect.

SERIAL PORT 0 SIGNALS (6 PINS)

DX0           1        Asynchronous reset. Placed in high-impedance state.
FSX0          1        Asynchronous reset. Placed in high-impedance state.
CLKR0         1        Asynchronous reset. Placed in high-impedance state.
DR0           1        Asynchronous reset. Placed in high-impedance state.
FSR0          1        Asynchronous reset. Placed in high-impedance state.
[illegible]   1        Asynchronous reset. Placed in high-impedance state.
FSX1          1        Asynchronous reset. Placed in high-impedance state.
[illegible]   1        Asynchronous reset. Placed in high-impedance state.
[illegible]   1        Asynchronous reset. Placed in high-impedance state.
[illegible]   1        Asynchronous reset. Placed in high-impedance state.

TIMER 1 SIGNAL (1 PIN)

VDD(3-0)      4        Reset has no effect.
PDVDD         1        Reset has no effect.
MDVDD         1        Reset has no effect.
VSS(3-0)      4        Reset has no effect.
DVSS(3-0)     4        Reset has no effect.
[illegible]   1        Reset has no effect.
X1            1        Reset has no effect.
X2/CLKIN      1        Reset has no effect.
[illegible]            Synchronous reset. Will go to its initial state when
                       RESET makes a 1 to 0 transition. See Appendix A.
[illegible]            Synchronous reset. Will go to its initial state when
                       RESET makes a 1 to 0 transition. See Appendix A.

EMULATION, TEST, and RESERVED (18 PINS)

EMU0                   Undefined.
EMU1                   Undefined.
EMU2                   Undefined.
EMU3                   Undefined.
EMU4                   Undefined.
EMU5                   Undefined.
EMU6                   Undefined.
RSV0                   Undefined.
RSV1                   Undefined.
RSV2                   Undefined.
RSV3                   Undefined.
RSV4                   Undefined.
RSV5                   Undefined.


At system reset, the following additional operations are performed: 


•  The peripherals are reset. This is a synchronous operation. The
   peripheral reset is described in Section 9.


•  The following CPU registers are loaded with zero:

   ST (CPU status register)
   IE (CPU/DMA interrupt enable flags)
   IF (CPU interrupt flags)
   IOF (I/O flags)


•  The reset vector is read from memory location 0h and loaded into the
   PC. This vector contains the start address of the system reset routine.


•  Execution begins. Refer to Section 12 for an example of a processor
   initialization routine.


Multiple TMS320C30s driven by the same system clock may be reset and 
synchronized. When the 1 to 0 transition of RESET occurs, the processor is 
placed on a well-defined internal phase, and all of the TMS320C30s will come 
up on the same internal phase. 

7.5 Interrupts


The TMS320C30 supports multiple internal and external interrupts, which can 
be used for a variety of applications. This section discusses the operation of 
these interrupts. 


A functional diagram of the logic used to implement the external interrupt
inputs is shown in Figure 7-4; the logic for internal interrupts is similar.
Additional information regarding internal interrupts can be found in Section 9.


[Figure: the INTn input passes through three flip-flops clocked by H1 and H3
and sets interrupt flag (n). The flag is gated by EINTn(CPU) and GIE(CPU)
toward the CPU interrupt processor and by EINTn(DMA) and GIE(DMA) toward the
DMA, and is cleared by RESET or by the internal interrupt clear/acknowledge
signal.]


Figure 7-4. Interrupt Logic Functional Diagram 


External interrupts are synchronized internally as illustrated by the three flip- 
flops clocked by H1 and H3. Once synchronized, the interrupt input will set 
the corresponding Interrupt Flag register (IF) bit if the interrupt is active. 


External interrupts can be effectively either edge- or level-triggered, depending 
on the duration of the low level on the interrupt input. An external interrupt 
must be held low for at least one H1/H3 cycle to be recognized by the 
TMS320C30. If the interrupt is held low for between one and three cycles, 
then only one interrupt is recognized. If the interrupt is held low three or more 
cycles, more than one interrupt may be recognized depending on how rapidly 
interrupts are serviced. 
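
The pulse-width rules above can be summarized as a small classifier. The Python sketch below is illustrative: the function name classify_pulse is hypothetical, and it assumes that a pulse of exactly three cycles falls in the "three or more" case, since the text leaves that boundary ambiguous.

```python
def classify_pulse(low_cycles):
    """Describe how a low pulse on INTn, measured in H1/H3 cycles, is seen."""
    if low_cycles < 1:
        return "not guaranteed to be recognized"
    if low_cycles < 3:
        return "exactly one interrupt recognized"
    # Assumption: the boundary of three cycles belongs to this case.
    return "one or more interrupts recognized, depending on service rate"
```

For edge-like behavior, an external circuit would therefore hold INTn low for at least one cycle but fewer than three.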


When a particular interrupt is processed by the CPU or DMA controller, the 
corresponding interrupt flag bit is cleared by the internal interrupt acknowl- 
edge signal. It should be noted, however, that if INTn is still low when the 
interrupt acknowledge signal occurs, the interrupt flag bit will only be cleared 
for one cycle and then set again since INTn is still low. Accordingly, it is the- 
oretically possible that, depending on when the IF register is read, this bit may 
be zero even though INTn is zero. When the TMS320C30 is reset, zero is 
written to the interrupt flag register, thereby clearing all pending interrupts. 

The interrupt flag register bits may be read and written under software control.
Writing a 1 to an IF register bit sets the associated interrupt flag to 1.
Similarly, writing a 0 resets the corresponding interrupt flag to 0. In this
way, all interrupts may be triggered and/or cleared through software. Since
the interrupt flags may be read, the interrupt pins may be polled in software
when an interrupt-driven interface is not required.


Internal interrupts operate in a similar manner. The bit in the IF register
corresponding to an internal interrupt may be read and written through
software. Writing a 1 sets the interrupt latch, and writing a 0 clears it. All
internal interrupts are one H1/H3 cycle in length.


The CPU global interrupt enable bit (GIE), located in the CPU status register
(ST), controls all CPU interrupts. All DMA interrupts are controlled by the
DMA global interrupt enable bit, which is not dependent upon ST(GIE) and
is local to the DMA. The DMA global interrupt enable bit is dependent, in
part, upon the state of the DMA SYNCH bits. It is not directly accessible
through software (see Section 9). The AND of the interrupt flag bit and the
interrupt enables is then connected to the interrupt processor.
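
The gating of flags by enables can be expressed as a few lines of bit arithmetic. The Python sketch below is a simplified model of the CPU side only. The function next_cpu_interrupt is hypothetical, and the mapping of bit n of IF and IE to the nth interrupt in priority order is an assumption made for illustration rather than a statement of the register layout.

```python
# Assumed bit order, highest priority first (bit 0 = INT0).
INTERRUPTS = ["INT0", "INT1", "INT2", "INT3", "XINT0", "RINT0",
              "XINT1", "RINT1", "TINT0", "TINT1", "DINT"]


def next_cpu_interrupt(if_reg, ie_reg, gie):
    """Return the highest-priority enabled pending interrupt, or None."""
    if not gie:
        return None                  # ST(GIE) = 0 blocks all CPU interrupts
    pending = if_reg & ie_reg        # flag AND individual enable
    for bit, name in enumerate(INTERRUPTS):
        if pending & (1 << bit):
            return name
    return None


def write_flag(if_reg, bit, value):
    """Model a software write of 0 or 1 to one IF register bit."""
    return if_reg | (1 << bit) if value else if_reg & ~(1 << bit)
```

Setting a flag with write_flag and then calling next_cpu_interrupt mirrors triggering an interrupt through software, as described above.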


To provide for maximum performance in servicing interrupts, the interrupt
acknowledge (IACK) instruction is provided. IACK drives the IACK pin and
performs a dummy read. The read is performed from the address specified by
the IACK instruction operand. When IACK is used, it typically is placed in the
early portion of an interrupt service routine. For certain applications, it may
be better suited at the end of the interrupt service routine or be totally
unnecessary.


The CPU controls all prioritization of interrupts (see Table 7-4 for reset and 
interrupt vector locations and priorities). If the DMA is not using interrupts 
for synchronization of transfers, it will not be affected by the processing of the 
CPU interrupts. If the CPU is involved in a pipeline conflict (branch, register, 
or memory), it will not respond to the interrupts until that conflict is resolved. 
It is therefore possible to interrupt the CPU and DMA simultaneously with the 
same or different interrupts and, in effect, synchronize their activities. For ex- 
ample, it may be necessary to cause a high-priority DMA transfer that avoids 
bus conflicts with the CPU, i.e., make the DMA higher priority than the CPU. 
This may be accomplished using an interrupt that causes the CPU to trap to 
an interrupt routine that contains an IDLE instruction. Then if the same in- 
terrupt is used to synchronize DMA transfers, the DMA transfer counter can 
be used to generate an interrupt, and thus return control to the CPU following 
the DMA transfer. 


Since the DMA and CPU share the same set of interrupt flags, the DMA may 
clear an interrupt flag before the CPU can respond to it. For example, if the 
CPU interrupts are disabled, the DMA can be responding to interrupts and 
thus clearing the associated interrupt flags. 

Table 7-4. Reset and Interrupt Vector Locations


RESET OR    VECTOR     PRIORITY   FUNCTION
INTERRUPT   LOCATION
RESET       0h         0          External reset signal input on the RESET
                                  pin.
INT0        1h         1          External interrupt input on the INT0 pin.
INT1        2h         2          External interrupt input on the INT1 pin.
INT2        3h         3          External interrupt input on the INT2 pin.
INT3        4h         4          External interrupt input on the INT3 pin.
XINT0       5h         5          Internal interrupt generated when serial
                                  port 0 transmit buffer is empty.
RINT0       6h         6          Internal interrupt generated when serial
                                  port 0 receive buffer is full.
XINT1       7h         7          Internal interrupt generated when serial
                                  port 1 transmit buffer is empty.
RINT1       8h         8          Internal interrupt generated when serial
                                  port 1 receive buffer is full.
TINT0       9h         9          Internal interrupt generated by timer 0.
TINT1       0Ah        10         Internal interrupt generated by timer 1.
DINT        0Bh        11         Internal interrupt generated by DMA
                                  controller 0.


If there is a delayed branch in the pipeline, interrupts are held pending until 
after the branch. If the interrupt occurs in the first cycle of the fetch of an in- 
struction, the fetched instruction is discarded (not executed), and the address 
of that instruction is pushed to the top of the system stack. If the interrupt 
occurs after the first cycle of the fetch, in the case of a multicycle fetch due to 
wait states, that instruction is executed and the address of the next instruction 
to be fetched is pushed to the top of the system stack. If no program fetch is 
occurring, then no new fetch is performed. After the address of the appropri- 
ate instruction has been pushed, the interrupt vector is fetched, loaded into the 
PC, and execution continues. 
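
The choice of return address can be stated compactly. The Python sketch below is a hypothetical helper (interrupt_return_address is not a device function) that assumes one-word instructions and a 1-based count of the cycle within the current fetch at which the interrupt arrives.

```python
def interrupt_return_address(fetch_address, fetch_cycle):
    """Return the address pushed onto the system stack.

    fetch_address: address of the instruction being fetched.
    fetch_cycle:   1 for the first cycle of the fetch, 2 or more for later
                   cycles of a multicycle fetch (wait states).
    """
    if fetch_cycle < 1:
        raise ValueError("fetch_cycle is 1-based")
    if fetch_cycle == 1:
        return fetch_address        # instruction discarded, refetched later
    return fetch_address + 1        # instruction executes, resume after it
```

In both cases the interrupted program resumes without skipping or repeating an instruction.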


The TMS320C30 allows the CPU and DMA to respond to and process interrupts
in parallel. Figure 7-5 shows interrupt processing flow. The interrupts
are polled and the CPU and DMA begin processing them. In the interrupt flow
pertaining to the CPU, the interrupt flag corresponding to the
highest-priority enabled interrupt is cleared, and GIE is set to 0. The CPU
completes all fetched instructions. The interrupt vector is fetched and loaded
into the PC, and the CPU continues execution. The DMA cycle is similar to that
for the CPU. After the pertinent interrupt flag is cleared, the DMA proceeds
based upon the status of the SYNCH bits in the DMA global control register.

IF ENABLED INTERRUPT IS A CPU INTERRUPT:

   CLEAR INTERRUPT FLAG
   COMPLETE ALL FETCHED INSTRUCTIONS
   PC -> *(++SP)
   FETCH INTERRUPT VECTOR
   CPU CONTINUES

IF ENABLED INTERRUPT IS A DMA INTERRUPT:

   CLEAR INTERRUPT FLAG
   DMA PROCEEDS BASED UPON SYNCH BITS
   DMA CONTINUES


Figure 7-5. Interrupt Processing 

Section 8


External Bus Operation 


Two external interfaces are provided on the TMS320C30: the primary bus and 
the expansion bus. These are used to access memories and external peripheral 
devices. Software controlled wait states and an external input signal provide 
for wait state generation. 


Major topics discussed in this hardware interface section are listed below. 


•  External Interface Control Registers (Section 8.1)

   - Primary bus
   - Expansion bus

•  External Interface Timing (Section 8.2)
•  Programmable Wait States (Section 8.3)
•  Programmable Bank Switching (Section 8.4)

8.1 External Interface Control Registers


The TMS320C30 provides two external interfaces: the primary bus and the
expansion bus. The primary bus consists of a 32-bit data bus, a 24-bit address
bus, and a set of control signals. The expansion bus consists of a 32-bit data
bus, a 13-bit address bus, and a set of control signals. Both buses support
software-controlled wait states and an external ready input signal. Both buses
support data, program, and I/O accesses.


When a primary bus access is performed, STRB is low. The expansion bus 
supports two types of accesses. One is used primarily for memory accesses 
that are signalled by MSTRB low. The timing for a MSTRB access is the same 
as that of the STRB access on the parallel interface. The other type of expan- 
sion bus access is commonly used for access of external peripheral devices 
and is signalled by IOSTRB low. 


The primary bus and the expansion bus each have an associated control reg- 
ister. These registers are memory-mapped as shown in Figure 8-1. 


REGISTER PERIPHERAL 
ADDRESS 
20806eh 
RESERVED 80806Fh 


Figure 8-1. Memory-Mapped External Interface Control Registers 

8.1.1 Primary Bus Control Register


The primary bus control register is a 32-bit register that contains the control 
bits for the primary bus (see Figure 8-2). Table 8-1 lists the register bits with 
the bit names and functions. 


[Register diagram: fields BNKCMP, WTCNT, SWW, HIZ, NOHOLD, and HOLDST,
described in Table 8-1; the remaining bits are reserved (xx).]


NOTE: xx = reserved bit, read as 0. 
R = read, W = write. 


Figure 8-2. Primary Bus Control Register 


Table 8-1. Primary Bus Control Register Bits Summary 


BIT     NAME      FUNCTION

0       HOLDST    Hold status bit. This bit signals if the port is being held
                  (HOLDST = 1) or is not being held (HOLDST = 0). This status
                  bit is valid whether the port has been held via hardware or
                  software.

1       NOHOLD    Port hold signal. NOHOLD allows or disallows the port to be
                  held by an external HOLD signal. When NOHOLD = 1, the
                  TMS320C30 takes over the external bus and controls it
                  regardless of requests by external devices. No hold
                  acknowledge (HOLDA) is asserted when a HOLD is received.
                  However, it is asserted if an internal hold is generated
                  (HIZ = 1). NOHOLD is set to 0 at reset.

2       HIZ       Internal hold. When set (HIZ = 1), the port is put in hold
                  mode. This is equivalent to the external HOLD signal. By
                  forcing a three-state condition, the TMS320C30 can
                  relinquish the external memory port through software. HOLDA
                  goes low when the port is three-stated. HIZ is set to 0 at
                  reset.

3-4     SWW       Software wait-state generation. In conjunction with WTCNT,
                  this 2-bit field defines the mode of wait-state generation.
                  It is set to 11 at reset.

5-7     WTCNT     Software wait mode. This 3-bit field specifies the number of
                  cycles to use when in software wait mode for the generation
                  of internal wait states. The range is zero (WTCNT = 000) to
                  seven (WTCNT = 111) H1/H3 cycles. It is set to 111 at
                  reset.

8-12    BNKCMP    Bank compare. This 5-bit field specifies the number of MSBs
                  of the address to be used to define the bank size. It is set
                  to 10000 at reset.
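
The field layout can be handled in software with ordinary shifts and masks. The Python sketch below is a minimal illustration, not TI-supplied code: the names FIELDS, decode, and encode are hypothetical, and the bit positions assumed for HOLDST, NOHOLD, and HIZ (bits 0, 1, and 2) as well as for SWW and WTCNT (bits 3-4 and 5-7) should be checked against the register figure before use.

```python
# name: (least significant bit, width in bits); positions are assumptions.
FIELDS = {
    "HOLDST": (0, 1),
    "NOHOLD": (1, 1),
    "HIZ":    (2, 1),
    "SWW":    (3, 2),
    "WTCNT":  (5, 3),
    "BNKCMP": (8, 5),
}


def decode(value):
    """Split a 32-bit control register value into its named fields."""
    return {name: (value >> lsb) & ((1 << width) - 1)
            for name, (lsb, width) in FIELDS.items()}


def encode(**fields):
    """Build a register value from named fields; unnamed fields are 0."""
    value = 0
    for name, field_value in fields.items():
        lsb, width = FIELDS[name]
        if not 0 <= field_value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        value |= field_value << lsb
    return value


# Field values stated for reset: SWW = 11, WTCNT = 111, BNKCMP = 10000.
RESET_VALUE = encode(SWW=0b11, WTCNT=0b111, BNKCMP=0b10000)
```

Under these assumptions, a program that wants zero software wait states would rewrite WTCNT with 0 and leave the other fields unchanged.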

8.1.2 Expansion Bus Control Register


The expansion bus control register is a 32-bit register that contains control
bits for the expansion bus (see Figure 8-3 and Table 8-2).


[Register diagram: fields WTCNT and SWW, described in Table 8-2; the
remaining bits are reserved (xx).]


NOTE: xx = reserved bit, read as 0. 
R = read, W = write. 


Figure 8-3. Expansion Bus Control Register 


Table 8-2. Expansion Bus Control Register Bits Summary 


BIT     NAME      FUNCTION

3-4     SWW       Software wait-state generation. In conjunction with the
                  WTCNT, this 2-bit field defines the mode of wait-state
                  generation. It is set to 11 at reset.

5-7     WTCNT     Software wait mode. This 3-bit field specifies the number of
                  cycles to use when in software wait mode for the generation
                  of internal wait states. The range is zero (WTCNT = 000) to
                  seven (WTCNT = 111) H1/H3 clock cycles. It is set to 111
                  at reset.

8.2 External Interface Timing


This section discusses functional timing of operations on the primary bus and 
the expansion bus, the TMS320C30’s two independent parallel buses. De- 
tailed timing specifications for all TMS320C30 signals are contained in Ap- 
pendix A, the TMS320C30 Data Sheet. 


The parallel buses implement three mutually exclusive address spaces
distinguished through the use of three separate control signals: STRB, MSTRB, and
IOSTRB. The STRB signal controls accesses on the primary bus, and the MSTRB
and IOSTRB signals control accesses on the expansion bus. Since the two buses are
independent, two accesses may be made in parallel.


With the exception of bank switching and the external HOLD function
(discussed later in this section), the timing of primary bus cycles and MSTRB
expansion bus cycles is identical, and they will be discussed collectively. The acronym
(M)STRB will be used in references which pertain equally to STRB and
MSTRB. Similarly (X)R/W, (X)A, (X)D, and (X)RDY are used to symbolize the
equivalent primary and expansion bus signals. The IOSTRB expansion bus
cycles are timed differently and will be discussed independently.


8.2.1 Primary Bus Cycles 


All bus cycles comprise integral numbers of H1 clock cycles. One H1 cycle is
defined to be from one falling edge of H1 to the next falling edge of H1. For
full speed (zero wait state) accesses, reads take one H1 cycle, while writes
take two cycles, unless the write follows a read, in which case the write takes
three cycles. Recall that internally (from the perspective of the CPU and
DMA) writes require only one cycle if no accesses to that interface are in
progress. The following discussions pertain to zero wait state accesses unless
otherwise specified.
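

The cycle counts stated above can be tallied for a sequence of back-to-back accesses. The following is a rough sketch only: it assumes no idle cycles between accesses, no bank switching, and that each wait state adds one H1 cycle per access. The function name and the "R"/"W" encoding are illustrative assumptions.

```python
def external_cycles(ops, wait_states=0):
    """Estimate H1 cycles for a string of back-to-back accesses.

    ops is a sequence of "R" (read) and "W" (write). A read takes one
    cycle, a write takes two, and a write that follows a read takes three.
    wait_states is added to every access (an assumption of this sketch).
    """
    total = 0
    previous = None
    for op in ops:
        if op == "R":
            cycles = 1
        elif op == "W":
            cycles = 3 if previous == "R" else 2
        else:
            raise ValueError("ops may contain only 'R' and 'W'")
        total += cycles + wait_states
        previous = op
    return total


print(external_cycles("RRW"))  # read, read, write after read
print(external_cycles("WWR"))  # write, write, read
```

Under these assumptions both three-access sequences cost five H1 cycles with zero wait states, since the extra cycle in the first sequence comes from the write following a read.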


The (M)STRB signal is low for the active portion of both reads and writes, 
which lasts one H1 cycle. Additionally before and after the active portion 
((M)STRB low) of writes only, there is a transition cycle of H1. During this 
transition cycle, the following occur: 


1) (M)STRB is high. 
2) If required, (X)R/W changes state on H1 rising. 


3) If required, the address changes on H1 rising if the previous H1 cycle was
the active portion of a write. If the previous H1 cycle was a read, the address
changes on H1 falling.


Figure 8-4 illustrates a read-read-write sequence for (M)STRB active and no 
wait states. The data is read as late in the cycle as possible to allow for the 
maximum access time from address valid. Note that although external writes 
take two cycles, internally (from the perspective of the CPU and DMA), they 
require only one cycle if no accesses to that interface are in progress. In the 
typical timing for all external interfaces, the (X)R/W strobe does not change 
until (M)STRB or IOSTRB goes inactive. 
[Timing diagram showing (M)STRB, (X)R/W, and (X)A]


Figure 8-4. Read-Read-Write for (M)STRB = 0
Figure 8-5 illustrates a write-write-read sequence for (M)STRB active and no
wait states. The address and data written are held valid approximately
one-half cycle after (M)STRB changes.


[Timing diagram showing H1, (M)STRB, (X)R/W, (X)A, and (X)D]


Figure 8-5. Write-Write-Read for (M)STRB = 0 




External Bus Operation - External Interface Timing 


Figure 8-6 illustrates a read cycle with one wait state. Since (X)RDY = 
1, the read cycle is extended. (M)STRB, (X)R/W, and (X)A are also ex- 
tended one cycle. The next time (X)RDY is sampled, it is 0. 


[Timing diagram showing (M)STRB, (X)R/W, and (X)RDY, with the extra cycle marked]


Figure 8-6. Use of Wait States for Read for (M)STRB = 0 
Figure 8-7 illustrates a write cycle with one wait state. Since initially
(X)RDY = 1, the write cycle is extended. (M)STRB, (X)R/W, and (X)A
are extended one cycle. The next time (X)RDY is sampled, it is 0.


[Timing diagram showing H3, H1, (M)STRB, (X)R/W, (X)A, and (X)RDY, with the
extra cycle marked]


Figure 8-7. Use of Wait States for Write for (M)STRB = 0 




8.2.2 Expansion Bus I/O Cycles
In contrast to primary bus and MSTRB cycles, IOSTRB reads and writes are both
two cycles in duration (with no wait states) and exhibit the same timing.
During these cycles, the address always changes on the falling edge of H1, and
IOSTRB is low from the rising edge of the first H1 cycle to the rising edge of
the second H1 cycle. The IOSTRB signal always goes inactive (high) between
cycles, and XR/W is high for reads and low for writes.


Figure 8-8 illustrates read and write cycles when IOSTRB is active and there
are no wait states. For IOSTRB accesses, reads and writes require a minimum
of two cycles. Some off-chip peripherals may change their status bits when
read or written. Therefore, it is important that valid addresses be maintained
when communicating with these peripherals. For reads and writes when
IOSTRB is active, IOSTRB is completely framed by the address.




Figure 8-8. Read and Write for IOSTRB = 0 
Figure 8-9 illustrates a read with one wait state when IOSTRB is active, and
Figure 8-10 illustrates a write with one wait state when IOSTRB is active. For
each wait state added, IOSTRB, XR/W, and XA are extended one clock cycle.
Writes hold the data on the bus one additional cycle. The sampling of XRDY
is repeated each cycle.


[Timing diagram showing IOSTRB, (X)R/W, (X)A, and (X)D, with the extra cycle
marked]


Figure 8-9. Read with One Wait-State for IOSTRB = 0
[Timing diagram with the extra cycle marked]


Figure 8-10. Write with One Wait-State for IOSTRB = 0
Figure 8-11 through Figure 8-13 illustrate the various transitions between
memory reads and writes, and I/O writes over the expansion bus.


[Timing diagram showing H1 and (X)A (memory address)]


Figure 8-11. Memory Read and I/O Write for Expansion Bus 
[Timing diagram showing (X)R/W, the memory address, and (X)D (memory write)]


Figure 8-12. I/O Write and Memory Read for Expansion Bus
[Timing diagram showing (X)R/W, (X)A (memory address), and (X)D]


Figure 8-13. Memory Write and I/O Read for Expansion Bus 
Figure 8-14 and Figure 8-15 illustrate the signal states when a bus is inactive
(after an IOSTRB or (M)STRB access respectively). The strobes (STRB, MSTRB,
IOSTRB) and (X)R/W go to 1. The address is undefined, and the ready signal
(XRDY or RDY) is ignored.


[Timing diagram showing IOSTRB, XR/W, XA (address undefined), and XRDY
(ignored) while the bus is inactive]


Figure 8-14. Inactive Bus States for IOSTRB 
[Timing diagram showing (M)STRB, (X)R/W, (X)A (address undefined), and (X)RDY
(ignored) while the bus is inactive]


Figure 8-15. Inactive Bus States for STRB and MSTRB 
Figure 8-16 illustrates the timing for HOLD and HOLDA. HOLD is an external
asynchronous input. There is a minimum of one cycle delay from when the
processor recognizes HOLD = 0 until HOLDA = 0. When HOLDA = 0, the address,
data buses, and associated strobes are placed in a high-impedance state. All
accesses occurring over an interface are completed before a hold is
acknowledged.


[Timing diagram showing write data on (X)D followed by the bus-inactive period]


Figure 8-16. HOLD and HOLDA Timing 
8.3 Programmable Wait States


Both the parallel and expansion interfaces allow the control of wait-state 
generation through the manipulation of their associated memory-mapped 
control registers. The SWW field is used to select the mode of wait-state 
generation, and the WTCNT field is used to load an internal timer used in the 
generation of wait states. The following four modes of wait-state generation 
can be used: 


• External RDY
• WTCNT-generated RDYwtcnt
• Logical-AND of RDY and RDYwtcnt
• Logical-OR of RDY and RDYwtcnt


These four modes are used in the generation of the internal ready signal that
controls accesses, RDYint. As long as RDYint = 1, the current external access
is delayed. When RDYint = 0, the current access completes. Since the use of
programmable wait states for both external interfaces is identical, only the
primary bus interface is described in the following paragraphs.


RDYwtcnt is an internally generated ready signal. When an external access is
begun, the value in WTCNT is loaded into a counter. WTCNT may be any
value from 0 through 7. The counter is decremented every H1/H3 clock cycle
until it becomes 0. Once the counter is set to 0, it remains set to 0 until the
next access. While the counter is nonzero, RDYwtcnt = 1. While the counter
is 0, RDYwtcnt = 0.


When SWW = 0 0, RDYint is only dependent upon RDY. RDYwtcnt is ignored.
The truth table for this mode is shown in Table 8-3.
Table 8-3. Wait-State Generation When SWW = 0 0


RDY   RDYwtcnt   RDYint

 0       0         0
 0       1         0
 1       0         1
 1       1         1


When SWW = 0 1, RDYint is only dependent upon RDYwtcnt. RDY is ignored.
Table 8-4 shows the truth table for this mode.


Table 8-4. Wait-State Generation When SWW = 01 


When SWW = 1 0, RDYint is the logical-OR (electrical-AND, since these
signals are low true) of RDY and RDYwtcnt (see Table 8-5).


Table 8-5. Wait-State Generation When SWW = 10 


RDY   RDYwtcnt   RDYint

 0       0         0
 0       1         0
 1       0         0
 1       1         1


When SWW = 1 1, RDYint is the logical-AND (electrical-OR, since these
signals are low true) of RDY and RDYwtcnt. The truth table for this mode is shown
in Table 8-6.


Table 8-6. Wait-State Generation When SWW = 11 
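

The four SWW modes and the WTCNT counter described in this section can be modelled in a few lines. This is a behavioural sketch, not a timing-accurate model: it assumes the ready signals are evaluated once per H1/H3 cycle and that the counter decrements after each evaluation. The function names and the list of sampled RDY values are illustrative assumptions.

```python
def rdy_int(sww, rdy, rdy_wtcnt):
    """Combine the low-true ready signals according to the SWW field."""
    if sww == 0b00:
        return rdy                 # external RDY only
    if sww == 0b01:
        return rdy_wtcnt           # WTCNT-generated ready only
    if sww == 0b10:
        return rdy & rdy_wtcnt     # electrical AND: ready when either is ready
    if sww == 0b11:
        return rdy | rdy_wtcnt     # electrical OR: ready when both are ready
    raise ValueError("SWW is a 2-bit field")


def count_wait_states(sww, wtcnt, rdy_samples):
    """Count cycles an access is delayed before RDYint goes to 0.

    rdy_samples holds the external RDY level (0 or 1) seen in each cycle.
    """
    if not 0 <= wtcnt <= 7:
        raise ValueError("WTCNT ranges from 0 through 7")
    counter = wtcnt                       # loaded when the access begins
    waits = 0
    for rdy in rdy_samples:
        rdy_wtcnt = 1 if counter else 0   # 1 while the counter is nonzero
        if rdy_int(sww, rdy, rdy_wtcnt) == 0:
            return waits                  # access completes
        waits += 1
        if counter:
            counter -= 1                  # stays at 0 once it reaches 0
    raise RuntimeError("access did not complete within the given samples")


print(count_wait_states(0b01, 3, [1] * 8))      # software wait states only
print(count_wait_states(0b10, 7, [1, 1, 0]))    # external RDY ends the wait early
print(count_wait_states(0b11, 2, [0, 0, 0, 0])) # must also wait for the counter
```

With SWW = 0 1 the number of wait states equals WTCNT; the two combined modes differ in whether an early external RDY can shorten the access (SWW = 1 0) or not (SWW = 1 1).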
8.4 Programmable Bank Switching


Programmable bank switching provides the capability of switching between 
external memory banks without the need for externally inserting wait states 
due to memories requiring several cycles to turn off. Bank switching is im- 
plemented on the primary bus and not on the expansion bus. 


The size of a bank is determined by the number of bits specified to be
examined. For example (see Figure 8-17), if BNKCMP = 16, the 16 MSBs of the
address are used to define a bank. Since addresses are 24 bits, the bank size
is specified by the 8 LSBs, yielding a bank size of 256 words in this case. If
BNKCMP > 16, only the 16 MSBs are compared. Bank sizes from 2^8 = 256
to 2^24 = 16M are allowed. Table 8-7 summarizes the relationship between
BNKCMP, the address bits used to define a bank, and the resulting bank size.


|<---------------- 24-bit Address ---------------->|
 23                             8 | 7             0
 Number of bits to compare        | Bank size


Figure 8-17. BNKCMP Example 


Table 8-7. BNKCMP and Bank Size 


BNKCMP    MSBs DEFINING A BANK    BANK SIZE (32-BIT WORDS)

00000     None                    2^24 = 16M
00001     23                      2^23 = 8M
00010     23-22                   2^22 = 4M
00011     23-21                   2^21 = 2M
00100     23-20                   2^20 = 1M
00101     23-19                   2^19 = 512K
00110     23-18                   2^18 = 256K
00111     23-17                   2^17 = 128K
01000     23-16                   2^16 = 64K
01001     23-15                   2^15 = 32K
01010     23-14                   2^14 = 16K
01011     23-13                   2^13 = 8K
01100     23-12                   2^12 = 4K
01101     23-11                   2^11 = 2K
01110     23-10                   2^10 = 1K
01111     23-9                    2^9  = 512
10000     23-8                    2^8  = 256
10001     Reserved                Undefined
through
11111
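

The relationship in Table 8-7 reduces to one expression: with 24-bit addresses, comparing BNKCMP MSBs leaves 24 - BNKCMP low-order bits to address words within a bank. The sketch below uses illustrative function names and rejects the reserved encodings rather than guessing at their behavior.

```python
ADDRESS_BITS = 24  # address width stated in this section


def bank_size_words(bnkcmp):
    """Bank size in 32-bit words for a BNKCMP value from 0 through 16."""
    if not 0 <= bnkcmp <= 16:
        raise ValueError("BNKCMP values above 1 0 0 0 0 are reserved")
    return 1 << (ADDRESS_BITS - bnkcmp)


def bank_number(address, bnkcmp):
    """The MSBs of the address that identify its bank."""
    if not 0 <= bnkcmp <= 16:
        raise ValueError("BNKCMP values above 1 0 0 0 0 are reserved")
    return (address & 0xFFFFFF) >> (ADDRESS_BITS - bnkcmp)


print(bank_size_words(16))            # 256 words
print(bank_size_words(11))            # 8K words
print(bank_number(0x000100, 16))      # second 256-word bank
```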




Hardware Interface - Programmable Bank Switching 




Internal to the TMS320C30 is a register that contains the MSBs (as defined 
by the BNKCMP field) of the last address used for a read or write over the 
primary interface. At reset, the register bits are set to zero. If the MSBs of the 
address being used for the current primary interface read do not match those 
contained in this internal register, a read cycle is not asserted for one H1/H3 
clock cycle. During this extra clock cycle, the address bus switches over to the 
new address, but STRB is inactive (high). The contents of the internal register 
are replaced with the MSBs being used for the current read of the current ad- 
dress. If the MSBs of the address being used for the current read match the 
bits in the register, a normal read cycle takes place. 


If repeated reads are performed from the same memory bank, no extra cycles 
are inserted. When reading from a different memory bank, memory conflicts 
are avoided by the insertion of an extra cycle. This feature can be disabled 
by setting BNKCMP to 0. The insertion of the extra cycle occurs only when a 
read is performed. The changing of the MSBs in the internal register occurs 
for all reads and writes over the primary interface. 
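

The bank-switching rules above can be sketched as a small state machine: an internal register holds the MSBs of the last primary-bus address, every access updates it, and only a read whose MSBs differ costs an extra cycle. The class and method names are illustrative assumptions; only the behavior is taken from the text.

```python
class BankSwitchModel:
    """Behavioural model of the primary-bus bank-compare register."""

    def __init__(self, bnkcmp=16):
        if not 0 <= bnkcmp <= 16:
            raise ValueError("BNKCMP values above 1 0 0 0 0 are reserved")
        self.shift = 24 - bnkcmp   # low-order bits that are not compared
        self.last_msbs = 0         # register bits are zero at reset

    def access(self, address, is_read):
        """Return the number of extra cycles (0 or 1) for this access."""
        msbs = (address & 0xFFFFFF) >> self.shift
        extra = 1 if is_read and msbs != self.last_msbs else 0
        self.last_msbs = msbs      # updated for all reads and writes
        return extra


bus = BankSwitchModel(bnkcmp=16)
for addr, is_read in [(0x000010, True), (0x000100, True),
                      (0x0001FF, True), (0x000200, False), (0x000201, True)]:
    print(hex(addr), "read" if is_read else "write", bus.access(addr, is_read))
```

With BNKCMP = 0 the shift removes every address bit, so the compared value is always zero and no extra cycle is ever inserted, which matches the statement that setting BNKCMP to 0 disables the feature.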


Figure 8-18 illustrates the addition of an inactive cycle when switches be- 
tween banks of memory occur. 
[Timing diagram showing R/W with the extra cycle marked]


Figure 8-18. Bank Switching Example 
Peripherals


The TMS320C30 provides two timers, two serial ports, and an on-chip Direct 
Memory Access (DMA) controller. These peripheral modules are manipulated 
through memory-mapped registers located on the dedicated peripheral bus. 


The DMA controller is used to perform input/output operations without in- 
terfering with the operation of the CPU. Therefore, it is possible to interface 
the TMS320C30 to slow external memories and peripherals (A/D’s, serial 
ports, etc.) without reducing the computational throughput of the CPU. The 
result is improved system performance and decreased system cost. 


Major topics discussed in this section on peripherals are listed below. 


• Timers (Section 9.1)
  - Registers
  - Pulse generation
  - Operation modes
• Serial Ports (Section 9.2)
  - Registers
  - Operation configurations
  - Timing
• DMA Controller (Section 9.3)
  - Registers
  - Memory transfer operation
  - Synchronization of DMA channels
9.1 Timers


The TMS320C30 timer modules are general-purpose 32-bit timer/event
counters, with two signalling modes and internal or external clocking (see
Figure 9-1). The timer modules can be used to signal to the TMS320C30 or
the external world at specified intervals, or to count external events. With an
internal clock, the timer can be used to signal an external A/D converter to
start a conversion, or it can interrupt the TMS320C30 DMA controller to begin
a data transfer. With an external input, the timer can count external events and
interrupt the CPU after a specified number of events. Available to each timer
is an I/O pin that can be used either as an input clock to the timer, an output
clock signal, or a general-purpose I/O pin.


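[Block diagram. The timer input is selected from INTERNAL CLOCK/2 or the
EXTERNAL CLOCK (optionally inverted by INV) and drives the 32-bit COUNTER.
The COUNTER REGISTER (31-0) and PERIOD REGISTER (31-0) feed a COMPARATOR
(PERIOD = COUNTER?), whose output drives the PULSE GENERATOR. The pulse
generator produces TSTAT and TIMER OUT.]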


Figure 9-1. Timer Block Diagram 


Three memory-mapped registers are used by each timer. They are: 


• Global control register
• Period register
• Counter register


The global control register determines the operating mode of the timer,
monitors the timer status, and controls the function of the I/O pin of the timer. The
period register specifies the timer's signalling frequency. The counter register
contains the current value of the incrementing counter. The timer can be
incremented on the rising edge or the falling edge of the input clock. The
counter is zeroed whenever its value equals that in the period register. The
pulse generator generates two types of external clock signals: pulse or clock.
The memory map for the timer modules is shown in Figure 9-2.


Register Peripheral Address 

Timer 0 Timer 1 

808024h  808034h
808027h  808037h


80802Ah 80803Ah 
80802Bh 80803Bh 


RESERVED 
RESERVED 


Figure 9-2. Memory-Mapped Timer Locations 


9.1.1 Timer Global Control Register 


The timer global control register is a 32-bit register that contains the global 
and port control bits for the timer module. Table 9-1 defines the register bits, 
names, and functions. Bits 3-0 are the port control bits. Bits 11-6 are the 
timer global control bits. Figure 9-3 shows the 32-bit register. Note that at
reset, all bits are set to 0 except for DATIN (set to the value read on TCLK).
Table 9-1. Timer Global Control Register Bits Summary


BITS   NAME       FUNCTION

0      FUNC       FUNC controls the function of TCLK. If FUNC = 0, TCLK is
                  configured as a general-purpose digital I/O port. If
                  FUNC = 1, TCLK is configured as a timer pin (see Figure 9-6
                  for a description of the relationship between FUNC and
                  CLKSRC).

1      I/O        If FUNC = 0 and CLKSRC = 0, TCLK is configured as a
                  general-purpose I/O pin. In this case, if I/O = 0, TCLK is
                  configured as a general-purpose input pin. If I/O = 1, TCLK
                  is configured as a general-purpose output pin.

2      DATOUT     DATOUT drives TCLK when in I/O port mode. DATOUT can also
                  be used as an input to the timer.

3      DATIN      Data input on TCLK or DATOUT. A write has no effect.

4-5    Reserved   Read as 0.

6      GO         The GO bit resets and starts the timer counter. When
                  GO = 1 and the timer is not held, the counter is zeroed and
                  begins incrementing on the next rising edge of the timer
                  input clock. The GO bit is cleared on the same rising edge.
                  GO = 0 has no effect on the timer.

7      HLD        Counter hold signal. When this bit is zero, the counter is
                  disabled and held in its current state. If the timer is
                  driving TCLK, the state of TCLK is also held. The internal
                  divide-by-two counter is also held so that the counter can
                  continue where it left off when HLD is set to 1. The timer
                  registers can be read and modified while the timer is being
                  held. RESET has priority over HLD. Table 9-2 shows the
                  effect of writing to GO and HLD.

8      C/P        Clock/Pulse mode control. When C/P = 1, clock mode is
                  chosen and the signalling of the status flag and external
                  output will have a 50 percent duty cycle. When C/P = 0, the
                  status flag and external output will be active for one H1
                  cycle during each timer period (see Figure 9-4).

9      CLKSRC     Specifies the source of the timer clock. When CLKSRC = 1,
                  an internal clock with frequency equal to one-half the H1
                  frequency is used to increment the counter. The INV bit has
                  no effect on the internal clock source. When CLKSRC = 0, an
                  external signal from the TCLK pin can be used to increment
                  the counter. The external clock is synchronized internally,
                  thus allowing external asynchronous clock sources not
                  exceeding the specified maximum allowable external clock
                  frequency. This will be less than f(H1)/2. (See Figure 9-6
                  for a description of the relationship between FUNC and
                  CLKSRC.)

10     INV        Inverter control bit. If an external clock source is used
                  and INV = 1, the external clock is inverted as it goes into
                  the counter. If the output of the pulse generator is routed
                  to TCLK and INV = 1, the output is inverted before it goes
                  to TCLK (see Figure 9-1). If INV = 0, no inversion is
                  performed on the input or output of the timer. The INV bit
                  has no effect, regardless of its value, when TCLK is used
                  in I/O port mode.

11     TSTAT      This bit indicates the status of the timer. It tracks the
                  output of the uninverted TCLK pin. This flag sets a CPU
                  interrupt on a transition from 0 to 1. A write has no
                  effect.
 31-16  15-12  11     10   9       8    7    6    5-4  3      2       1    0
 xx     xx     TSTAT  INV  CLKSRC  C/P  HLD  GO   xx   DATIN  DATOUT  I/O  FUNC
               R/W    R/W  R/W     R/W  R/W  R/W       R      R/W     R/W  R/W


NOTE: xx = Reserved bit, read as 0.
R = read, W = write.


Figure 9-3. Timer Global Control Register 
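

As a worked illustration of the register layout, the sketch below packs named control bits into a value and unpacks them again. The bit positions follow Figure 9-3 and Table 9-1 (bits 3-0 port control, bits 11-6 timer global control); the dictionary, the function names, and the chosen example configuration are illustrative assumptions.

```python
# Bit positions per Figure 9-3 and Table 9-1; names of helpers are illustrative.
TIMER_BITS = {
    "FUNC": 0, "I/O": 1, "DATOUT": 2, "DATIN": 3,
    "GO": 6, "HLD": 7, "C/P": 8, "CLKSRC": 9, "INV": 10, "TSTAT": 11,
}


def pack_timer_control(fields):
    """Build a register value from a {name: 0 or 1} mapping."""
    value = 0
    for name, bit in fields.items():
        if bit not in (0, 1):
            raise ValueError("each field is a single bit")
        value |= bit << TIMER_BITS[name]
    return value


def unpack_timer_control(value):
    """Return every named bit of a register value."""
    return {name: (value >> pos) & 1 for name, pos in TIMER_BITS.items()}


# Example: internal clock, pulse mode, TCLK as timer pin, reset and start.
start = pack_timer_control(
    {"GO": 1, "HLD": 1, "C/P": 0, "CLKSRC": 1, "FUNC": 1}
)
print(hex(start))
```

Note that HLD must be 1 for the timer to run, since a zero in that bit holds the counter.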


Table 9-2 shows the result of a write using specified values of the GO and HLD 
bits in the global control register. 


Table 9-2. Result of a Write of Specified Values of GO and HLD 


GO    HLD    RESULT

0     0      All timer operations are held. No reset is performed.

0     1      Timer proceeds from state before write.

1     0      All timer operations are held, including zeroing of the
             counter. The GO bit is not cleared until the timer is taken
             out of hold.

1     1      Timer reset and started.


9.1.2 Timer Period and Counter Registers 


The 32-bit timer period register is used to specify the frequency of the timer
signalling. The timer counter register is a 32-bit register, which is reset to zero
whenever it increments to the value of the period register. Both registers are
set to 0 at reset.


Certain boundary conditions affect timer operation, such as a zero in the pe- 
riod register and overflowing the counter. These conditions are listed as fol- 
lows: 


• When the period and counter registers are zero, the operation of the
timer is dependent upon the C/P mode selected. In pulse mode (C/P =
0), TSTAT is set and remains set. In clock mode (C/P = 1), the width
of the cycle is 2/f(H1) and the external clocks are ignored.


• When the counter register is not 0 and the period register = 0, the
counter will count, roll over to 0, and then behave as described above.


• When the counter register is set to a value greater than the period
register, the counter may overflow when being incremented. Once the
counter reaches its maximum 32-bit value (0FFFFFFFFh), it simply
clocks over to 0 and continues.


Writes from the peripheral bus override register updates from the counter or 
new status updates to the control register. 
9.1.3 Timer Pulse Generation


The timer pulse generator (see Figure 9-1) can generate several different
external signals. These signals may be inverted with the INV bit. The two basic
modes are pulse mode and clock mode, as shown in Figure 9-4. In both
modes, an internal clock source has a frequency of f(H1)/2, and an external
clock source has a maximum frequency less than f(H1)/2. Refer to timer
timing in Appendix A. In pulse mode (C/P = 0), the width of the pulse is
1/f(H1).


[Timing diagrams, with intervals labelled in terms of 1/f(H1), 2/f(H1),
1/f(CLKSRC), and period register/f(CLKSRC):

(a) TSTAT and timer output (INV = 0) when C/P = 0 (pulse mode)

(b) TSTAT and timer output (INV = 0) when C/P = 1 (clock mode), with a full
output cycle of 2 x period register/f(CLKSRC)]


Figure 9-4. Timer Timing 
The rate of timer signalling is determined by the frequency of the timer input
clock and the period register. The following equations are valid with either
an internal or an external timer clock:


f(pulse mode) = f(timer clock)/period register 
f(clock mode) = f(timer clock)/(2 x period register) 
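

The two equations can be checked numerically. In the sketch below the H1 frequency is a hypothetical value chosen only for the example, and the function names are illustrative; a period of zero is rejected because it is one of the boundary conditions described in Section 9.1.2 rather than a case covered by these equations.

```python
def internal_timer_clock(h1_hz):
    """The internal timer clock runs at one-half the H1 frequency."""
    return h1_hz / 2


def timer_signal_frequency(timer_clock_hz, period, clock_mode):
    """Apply f = f(timer clock)/period, or /(2 x period) in clock mode."""
    if period <= 0:
        raise ValueError("period register must be nonzero for these equations")
    divisor = 2 * period if clock_mode else period
    return timer_clock_hz / divisor


H1_HZ = 16_000_000  # hypothetical H1 frequency, for illustration only
clk = internal_timer_clock(H1_HZ)
print(timer_signal_frequency(clk, 1000, clock_mode=False))  # pulse mode
print(timer_signal_frequency(clk, 1000, clock_mode=True))   # clock mode
```

For the same period register value, clock mode signals at half the pulse-mode rate because the output toggles once per period rather than pulsing once per period.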


9.1.4 Timer Operation Modes 


The timer can receive its input and send its output in several different modes,
depending upon the setting of CLKSRC, FUNC, and I/O. The four timer
modes of operation are defined as follows:


• If CLKSRC = 1 and FUNC = 0, the timer input comes from the internal
clock. The internal clock is not affected by the INV bit. In this mode,
TCLK is connected to the I/O port control and can be used as a
general-purpose I/O pin (see Figure 9-5). If I/O = 0, TCLK is configured as
a general-purpose input pin whose state can be read in DATIN.
DATOUT has no effect on TCLK or DATIN. If I/O = 1, TCLK is configured
as a general-purpose output pin. DATOUT is placed on TCLK and can
be read in DATIN.


[Diagrams of the internal/external boundary at the TCLK pin:

(a) I/O = 0: DATOUT is not connected; TCLK is read in DATIN.

(b) I/O = 1: DATOUT drives TCLK and is read in DATIN.]


Figure 9-5. Timer I/O Port Configurations 


• If CLKSRC = 1 and FUNC = 1, the timer input comes from the internal
clock and the timer output goes to TCLK. This value may be inverted
using INV, and the value output on TCLK can be read in DATIN.


• If CLKSRC = 0 and FUNC = 0, the timer is driven according to the
status of the I/O bit. If I/O = 0, the timer input comes from TCLK. This value
can be inverted using INV, and the value of TCLK can be read in DATIN.
If I/O = 1, TCLK is an output pin. Then TCLK and the timer are both
driven by DATOUT. All 0 to 1 transitions of DATOUT increment the
counter. INV has no effect on DATOUT. The value of DATOUT can be
read in DATIN.


• If CLKSRC = 0 and FUNC = 1, TCLK drives the timer. If INV = 0, all 0
to 1 transitions of TCLK increment the counter. If INV = 1, all 1 to 0
transitions of TCLK increment the counter. The value of TCLK can be
read in DATIN.


Figure 9-6 shows the four timer modes of operation. 


[Block diagrams of the four configurations, each showing the internal/external
boundary at the TCLK pin:

(a) CLKSRC = 1 (internal), FUNC = 0 (I/O pin): the internal clock drives the
    timer input; TCLK connects to the I/O port control.

(b) CLKSRC = 1 (internal), FUNC = 1 (timer pin): the internal clock drives the
    timer input; the timer output drives TCLK, which is read in DATIN.

(c) CLKSRC = 0 (external), FUNC = 0 (I/O pin): TCLK connects to the I/O port
    control.

(d) CLKSRC = 0 (external), FUNC = 1 (timer pin): TCLK drives the timer input
    and is read in DATIN.]


Figure 9-6. Timer Modes as Defined by CLKSRC and FUNC 
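

The four modes described in this section can be summarized as a lookup from the control bits to the timer input source and the role of the TCLK pin. This is a descriptive sketch only; the function name and the returned strings are illustrative assumptions.

```python
def timer_mode(clksrc, func, io=0):
    """Return (timer_input, tclk_role) for the given control-bit values."""
    if clksrc == 1 and func == 0:
        # Internal clock; TCLK is a general-purpose pin selected by I/O.
        role = "general-purpose output" if io else "general-purpose input"
        return ("internal clock", role)
    if clksrc == 1 and func == 1:
        # Internal clock; the timer output is routed to TCLK.
        return ("internal clock", "timer output")
    if clksrc == 0 and func == 0:
        # The I/O bit decides whether TCLK or DATOUT drives the timer.
        if io:
            return ("DATOUT", "general-purpose output")
        return ("TCLK", "general-purpose input")
    if clksrc == 0 and func == 1:
        # External clock on TCLK drives the timer directly.
        return ("TCLK", "timer input")
    raise ValueError("CLKSRC and FUNC must each be 0 or 1")


for clksrc in (1, 0):
    for func in (0, 1):
        print(clksrc, func, timer_mode(clksrc, func))
```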


Peripherals - Serial Ports 


9.2 Serial Ports 


The two TMS320C30 serial ports are totally independent. Both serial ports 
are identical with a complementary set of control registers in each one. Each 
serial port can be configured to transfer 8, 16, 24, or 32 bits of data per word. 
The clock for each serial port can originate either internally or externally. An 
internally generated clock is a divide-down of the clockout frequency (H1). 
A continuous transfer mode is available which allows the serial port to transmit 
and receive any number of words without new synchronization pulses. 


Eight memory-mapped registers are provided for each serial port. They are: 


• Port global control register
• Two port control registers for the six I/O pins
• Three port receive/transmit timer registers
• Data transmit register
• Data receive register


The global control register controls the global functions of the serial port and
determines the serial port operating mode. Two port control registers control
the functions of the six serial port pins. The transmit buffer contains the next 
complete word to be transmitted. The receive buffer contains the last com- 
plete word received. Three additional registers are associated with the 
transmit/receive sections of the serial port timer. A serial port block diagram 
is shown in Figure 9-7, and the memory map of a serial port is shown in Figure 
9-8. 
[Block diagram of the serial port, divided into a receive section and a
transmit section. Each section contains a 16-bit timer, a clock, a bit counter
(8/16/24/32), and load control. Labelled signals: CLKX, FSX, FSR, DX, DR,
TSTAT, RINT, and XINT.]


Figure 9-7. Serial Port Block Diagram 
Register                   Peripheral Address
                           Serial Port 0    Serial Port 1

DATA RECEIVE REGISTER      80804Ch          80805Ch
RESERVED                   80804Dh          80805Dh
RESERVED                   80804Eh          80805Eh
RESERVED                   80804Fh          80805Fh


Figure 9-8. Memory-Mapped Locations for the Serial Port 


9.2.1 Serial Port Global Control Register 


The serial port global control register is a 32-bit register that contains the 
global control bits for the serial port. Table 9-3 defines the register bits, bit 
names, and bit functions. The register is shown in Figure 9-9. 
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Table 9-3. Serial Port Global Control Register Bits Summary 


Veit | NAME FUNCTION 


RRDY lf RRDY = 1, the receive buffer has new data and is ready to be read. A three 
H1/H3 cycle delay occurs from the reading of DRR to RRDY = 1. Therising edge 
of this signal sets RINT. If RRDY = 0, the receive buffer does not have new data 
since the last read. RRDY is set to O at reset and after the receive buffer is read. 
XRDY If XRDY = 1, the transmit buffer has written the last bit of data to the shifter and 
is ready for a new word. A three H1/H3 cycle delay occurs from the loading of 
the transmit shifter to XRDY being set to 1. The rising edge of this signal sets 
XINT. If XRDY = O, the transmit buffer has not written the last bit of data to the 

transmit shifter and is not ready for a new word. XRDY is set to 1 at reset. 
a FSXOUT | This bit configures the FSX pin as an input (FSXOUT = 0) or an output (FSXOUT 
XSREMPTY 
RSRFULL 
dominate and RSRFULL is set to 1. If RSRFULL = 0, no overrun of the receiver 
has occurred. 


HS lf HS = 1, the handshake mode is enabled. If HS = O, the handshake mode is 
disabled. 


XCLKSRCE] If XCLKSRCE = 1, the internal transmit clock is used. If XCLKSRCE = 0, the 
external transmit clock is used. 


7 RCLKSRCE| If RCLKSRCE = 1, the internal receive clock is used. If RCLKSRCE = 0, the ex- 
ternal receive clock is used. 


XVAREN This bit specifies fixed (XVAREN = 0) or variable (XVAREN = 1) data rate sig- 


nalling when transmitting. With a fixed data rate, FSX is active for at least one 
XCLK cycle, and then goes inactive before transmission begins. With variable 

Pf RVAREN 

¥ XFSM 


lf XSREMPTY = 0, the transmit shift register is empty. If XSREMPTY = 1, the 
transmit shift register is not empty. This bit is set to 0 at reset or by an XRESET. 


If RSRFULL = 1, an overrun of the receiver has occurred. In continuous mode, 
RSRFULL is set to 1 when both RSR and DRR are full. In noncontinuous mode, 
RSRFULL is set to 1 when RSR and DRR are full and a new FSR is received. A 
read causes this bit to be set to 0. This bit can only be set to O by a system reset, 
a serial port receive reset (RRESET = 1), or a read. When the receiver tries to set 
RSRFULL to a1 at the same time that the global register is read, the receiver will 


data rate, FSX is active while all bits are being transmitted. When using an external
FSX and variable data rate signaling, the DX pin is driven by the transmitter
when FSX is held active or when a word is being shifted out.

9      RVAREN   This bit specifies fixed (RVAREN = 0) or variable (RVAREN = 1) data rate
                signalling when receiving. With a fixed data rate, FSR is active for at
                least one RCLK cycle, and then goes inactive before the reception begins.
                With variable data rate, FSR is active while all bits are being received.

10     XFSM     Transmit frame sync mode. Configures the port for continuous mode
                operation (XFSM = 1) or standard mode (XFSM = 0). In continuous mode,
                only the first word of a block generates a sync pulse, and the rest are
                simply transmitted continuously to the end of the block. In standard
                mode, each word has an associated sync pulse.

11     RFSM     Receive frame sync mode. Configures the port for continuous mode
                (RFSM = 1) or standard mode (RFSM = 0) operation. In continuous mode,
                only the first word of a block generates a sync pulse, and the rest are
                simply received continuously without expectation of another sync pulse.
                In standard mode, each word received has an associated sync pulse.

12     CLKXP    CLKX polarity. If CLKXP = 0, CLKX is active high. If CLKXP = 1, CLKX is
                active low.

13     CLKRP    CLKR polarity. If CLKRP = 0, CLKR is active high. If CLKRP = 1, CLKR is
                active low.

14     DXP      DX polarity. If DXP = 0, DX is active high. If DXP = 1, DX is active low.

15     DRP      DR polarity. If DRP = 0, DR is active high. If DRP = 1, DR is active low.

16     FSXP     FSX polarity. If FSXP = 0, FSX is active high. If FSXP = 1, FSX is
                active low.

17     FSRP     FSR polarity. If FSRP = 0, FSR is active high. If FSRP = 1, FSR is
                active low.

18-19  XLEN     These bits define the word length of serial data transmitted. All data
                is assumed to be right-justified in the transmit buffer when fewer than
                32 bits are specified.
                0 0 --- 8 bits      1 0 --- 24 bits
                0 1 --- 16 bits     1 1 --- 32 bits

20-21  RLEN     These bits define the word length of serial data received. All data is
                right-justified in the receive buffer.
                0 0 --- 8 bits      1 0 --- 24 bits
                0 1 --- 16 bits     1 1 --- 32 bits

22     XTINT    Transmit timer interrupt enable. If XTINT = 0, the transmit timer
                interrupt is disabled. If XTINT = 1, the transmit timer interrupt is
                enabled.

23     XINT     Transmit interrupt enable. If XINT = 0, the transmit interrupt is
                disabled. If XINT = 1, the transmit interrupt is enabled. Note that the
                CPU transmit interrupt flag XINT is the logical OR of the enabled
                transmit timer interrupt and the enabled transmit interrupt.

24     RTINT    Receive timer interrupt enable. If RTINT = 0, the receive timer
                interrupt is disabled. If RTINT = 1, the receive timer interrupt is
                enabled.

25     RINT     Receive interrupt enable. If RINT = 0, the receive interrupt is
                disabled. If RINT = 1, the receive interrupt is enabled. Note that the
                CPU receive interrupt flag RINT is the logical OR of the enabled receive
                timer interrupt and the enabled receive interrupt.

26     XRESET   Transmit reset. If XRESET = 0, the transmit side of the serial port is
                reset. To take the transmit side of the serial port out of reset, XRESET
                should be set to 1. However, XRESET should not be set to 1 until at
                least three cycles after RESET goes inactive. This applies only to
                system reset. Setting XRESET to 0 does not change the contents of any of
                the serial port control registers. It places the transmitter in a state
                corresponding to the beginning of a frame of data. Resetting the
                transmitter generates a transmit interrupt. This bit should be set at
                the same time the mode of the transmitter is set. XFSM can be toggled
                without resetting the global control register.

27     RRESET   Receive reset. If RRESET = 0, the receive side of the serial port is
                reset. To take the receive side of the serial port out of reset, RRESET
                should be set to 1. Setting RRESET to 0 does not change the contents of
                any of the serial port control registers. It places the receiver in a
                state corresponding to the beginning of a frame of data. This bit should
                be set at the same time the mode of the receiver is set. RFSM can be
                toggled without resetting the global control register.
 31-28    27      26     25     24     23     22    21-20  19-18   17     16
   xx   RRESET  XRESET  RINT  RTINT   XINT  XTINT   RLEN   XLEN   FSRP   FSXP
         R/W     R/W    R/W    R/W    R/W    R/W    R/W    R/W    R/W    R/W

   15      14      13      12      11      10       9        8
  DRP     DXP    CLKRP   CLKXP    RFSM    XFSM    RVAREN   XVAREN
  R/W     R/W     R/W     R/W     R/W     R/W      R/W      R/W

    7          6        5       4         3         2       1      0
 RCLKSRCE   XCLKSRCE    HS   RSRFULL   XSREMPTY   FSXOUT   XRDY   RRDY
   R/W        R/W      R/W      R         R        R/W      R      R


NOTE: xx =Reserved bit, read as 0. 
R = read, W = write. 


Figure 9-9. Serial Port Global Control Register 
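The bit layout of the global control register can be exercised with a small host-side Python sketch. The field positions below are this sketch's reading of Table 9-3 and Figure 9-9 (for example, XLEN in bits 18-19, RLEN in bits 20-21, and RRESET in bit 27), the two-bit length codes are read with the higher-numbered bit first, and the function names are hypothetical rather than part of any TI tool.

```python
# Field name -> (least significant bit, width). Positions are an assumption
# based on this sketch's reading of Table 9-3 and Figure 9-9.
GLOBAL_CTRL_FIELDS = {
    "RRDY": (0, 1), "XRDY": (1, 1), "FSXOUT": (2, 1), "XSREMPTY": (3, 1),
    "RSRFULL": (4, 1), "HS": (5, 1), "XCLKSRCE": (6, 1), "RCLKSRCE": (7, 1),
    "XVAREN": (8, 1), "RVAREN": (9, 1), "XFSM": (10, 1), "RFSM": (11, 1),
    "CLKXP": (12, 1), "CLKRP": (13, 1), "DXP": (14, 1), "DRP": (15, 1),
    "FSXP": (16, 1), "FSRP": (17, 1), "XLEN": (18, 2), "RLEN": (20, 2),
    "XTINT": (22, 1), "XINT": (23, 1), "RTINT": (24, 1), "RINT": (25, 1),
    "XRESET": (26, 1), "RRESET": (27, 1),
}

# XLEN/RLEN code -> word length in bits (00 = 8, 01 = 16, 10 = 24, 11 = 32).
WORD_LENGTH_BITS = {0b00: 8, 0b01: 16, 0b10: 24, 0b11: 32}


def encode_global_control(**fields):
    """Pack named field values into a 32-bit register image."""
    value = 0
    for name, field_value in fields.items():
        lsb, width = GLOBAL_CTRL_FIELDS[name]
        if not 0 <= field_value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bit(s)")
        value |= field_value << lsb
    return value


def decode_global_control(value):
    """Split a 32-bit register image into a dict of named field values."""
    value &= 0xFFFFFFFF
    return {name: (value >> lsb) & ((1 << width) - 1)
            for name, (lsb, width) in GLOBAL_CTRL_FIELDS.items()}


# 32-bit transmit words, 16-bit receive words, both sides out of reset.
image = encode_global_control(XLEN=0b11, RLEN=0b01, XRESET=1, RRESET=1)
fields = decode_global_control(image)
print(hex(image), WORD_LENGTH_BITS[fields["RLEN"]])
```

A table-driven layout keeps the bit positions in one place, so a correction to any single position does not touch the packing logic.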


9.2.2 FSX/DX/CLKX Port Control Register 


This 32-bit port control register controls the function of the serial port FSX, 
DX, and CLKX pins. At reset, all bits are set to 0. Table 9-4 defines the register 
bits, bit names, and functions. Figure 9-10 shows this port control register. 


Table 9-4. FSX/DX/CLKX Port Control Register Bits Summary

BIT   NAME         FUNCTION

0     CLKXFUNC     CLKXFUNC controls the function of CLKX. If CLKXFUNC = 0, CLKX is
                   configured as a general-purpose digital I/O port. If CLKXFUNC = 1,
                   CLKX is a serial port pin.

1     CLKXI/O      If CLKXI/O = 0, CLKX is configured as a general-purpose input pin.
                   If CLKXI/O = 1, CLKX is configured as a general-purpose output pin.

2     CLKXDATOUT   Data output on CLKX.

3     CLKXDATIN    Data input on CLKX. A write has no effect.

4     DXFUNC       DXFUNC controls the function of DX. If DXFUNC = 0, DX is configured
                   as a general-purpose digital I/O port. If DXFUNC = 1, DX is a serial
                   port pin.

5     DXI/O        If DXI/O = 0, DX is configured as a general-purpose input pin. If
                   DXI/O = 1, DX is configured as a general-purpose output pin.

6     DXDATOUT     Data output on DX.

7     DXDATIN      Data input on DX. A write has no effect.

8     FSXFUNC      FSXFUNC controls the function of FSX. If FSXFUNC = 0, FSX is
                   configured as a general-purpose digital I/O port. If FSXFUNC = 1,
                   FSX is a serial port pin.

9     FSXI/O       If FSXI/O = 0, FSX is configured as a general-purpose input pin. If
                   FSXI/O = 1, FSX is configured as a general-purpose output pin.

10    FSXDATOUT    Data output on FSX.

11    FSXDATIN     Data input on FSX. A write has no effect.
 31-12      11         10         9        8
   xx    FSXDATIN  FSXDATOUT   FSXI/O   FSXFUNC
            R         R/W       R/W       R/W

    7         6         5        4
 DXDATIN  DXDATOUT    DXI/O   DXFUNC
    R        R/W       R/W      R/W

     3          2           1         0
 CLKXDATIN  CLKXDATOUT   CLKXI/O  CLKXFUNC
     R         R/W         R/W       R/W

NOTE: xx =Reserved bit, read as 0. 
R = read, W = write. 


Figure 9-10. FSX/DX/CLKX Port Control Register 
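The per-pin FUNC, I/O, DATOUT, and DATIN bits repeat in groups of four, which the following Python sketch illustrates for the FSX/DX/CLKX register. It assumes the bit order described in Table 9-4 (CLKX in bits 0-3, DX in bits 4-7, FSX in bits 8-11); the helper names are hypothetical.

```python
# Base bit of each pin's four-bit group (assumed from Table 9-4).
PIN_BASE = {"CLKX": 0, "DX": 4, "FSX": 8}

# Offsets inside a group: FUNC, I/O, DATOUT, DATIN.
FUNC, IO, DATOUT, DATIN = 0, 1, 2, 3


def configure_pin(reg, pin, serial=False, output=False, level=0):
    """Return a new register image with one pin's FUNC, I/O and DATOUT bits set.

    DATIN is left untouched because a write to it has no effect.
    """
    base = PIN_BASE[pin]
    reg &= ~(0b0111 << base) & 0xFFFFFFFF      # clear FUNC, I/O, DATOUT
    reg |= int(serial) << (base + FUNC)        # 1 = serial port pin
    reg |= int(output) << (base + IO)          # 1 = general-purpose output
    reg |= (level & 1) << (base + DATOUT)      # value driven when an output
    return reg


def read_pin_input(reg, pin):
    """Extract the DATIN bit for a pin from a register image."""
    return (reg >> (PIN_BASE[pin] + DATIN)) & 1


# DX as a general-purpose output driving 1, FSX handed to the serial port.
reg = configure_pin(0, "DX", output=True, level=1)
reg = configure_pin(reg, "FSX", serial=True)
print(hex(reg))
```

The same helper applies to the FSR/DR/CLKR register if the pin names in `PIN_BASE` are swapped for CLKR, DR, and FSR, assuming that register mirrors this layout.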


9.2.3 FSR/DR/CLKR Port Control Register 


This 32-bit port control register controls the function of the serial port
FSR, DR, and CLKR pins. At reset, all bits are set to 0. Table 9-5 defines the
register bits, the bit names, and functions. Figure 9-11 illustrates this port
control register.


Table 9-5. FSR/DR/CLKR Port Control Register Bits Summary

BIT   NAME         FUNCTION

0     CLKRFUNC     CLKRFUNC controls the function of CLKR. If CLKRFUNC = 0, CLKR is
                   configured as a general-purpose digital I/O port. If CLKRFUNC = 1,
                   CLKR is a serial port pin.

1     CLKRI/O      If CLKRI/O = 0, CLKR is configured as a general-purpose input pin.
                   If CLKRI/O = 1, CLKR is configured as a general-purpose output pin.

2     CLKRDATOUT   Data output on CLKR.

3     CLKRDATIN    Data input on CLKR. A write has no effect.

4     DRFUNC       DRFUNC controls the function of DR. If DRFUNC = 0, DR is configured
                   as a general-purpose digital I/O port. If DRFUNC = 1, DR is a serial
                   port pin.

5     DRI/O        If DRI/O = 0, DR is configured as a general-purpose input pin. If
                   DRI/O = 1, DR is configured as a general-purpose output pin.

6     DRDATOUT     Data output on DR.

7     DRDATIN      Data input on DR. A write has no effect.

8     FSRFUNC      FSRFUNC controls the function of FSR. If FSRFUNC = 0, FSR is
                   configured as a general-purpose digital I/O port. If FSRFUNC = 1,
                   FSR is a serial port pin.

9     FSRI/O       If FSRI/O = 0, FSR is configured as a general-purpose input pin. If
                   FSRI/O = 1, FSR is configured as a general-purpose output pin.

10    FSRDATOUT    Data output on FSR.

11    FSRDATIN     Data input on FSR. A write has no effect.
 31-12      11         10         9        8
   xx    FSRDATIN  FSRDATOUT   FSRI/O   FSRFUNC
            R         R/W       R/W       R/W

    7         6         5        4
 DRDATIN  DRDATOUT    DRI/O   DRFUNC
    R        R/W       R/W      R/W

     3          2           1         0
 CLKRDATIN  CLKRDATOUT   CLKRI/O  CLKRFUNC
     R         R/W         R/W       R/W


NOTE: xx =Reserved bit, read as 0. 
R = read, W = write. 


Figure 9-11. FSR/DR/CLKR Port Control Register 


9.2.4 Receive/Transmit Timer Control Register 


A 32-bit receive/transmit timer control register contains the control bits for the 
timer module. At reset, all bits are set to 0. Table 9-6 lists the register bits, bit 
names, and functions. Bits 5-0 control the transmitter timer. Bits 11-6 control 
the receiver timer. Figure 9-12 shows the register. 
Table 9-6. Receive/Transmit Timer Control Register

BIT    NAME      FUNCTION

0      XGO       The XGO bit resets and starts the transmit timer counter. When XGO is
                 set to 1 and the timer is not held, the counter is zeroed and begins
                 incrementing on the next rising edge of the timer input clock. The XGO
                 bit is cleared on the same rising edge. Writing 0 to XGO has no effect
                 on the transmit timer.

1      XHLD      Transmit counter hold signal. When this bit is set to 0, the counter
                 is disabled and held in its current state. The internal divide-by-two
                 counter is also held so that the counter will continue where it left
                 off when XHLD is set to 1. The timer registers may be read and
                 modified while the timer is being held. RESET has priority over XHLD.

2      XC/P      Clock/pulse mode control. When XC/P = 1, the clock mode is chosen. The
                 signalling of the status flag and external output has a 50-percent
                 duty cycle. When XC/P = 0, the status flag and external output are
                 active for one CLKOUT cycle during each timer period.

3      XCLKSRC   This bit specifies the source of the transmit timer clock. When
                 XCLKSRC = 1, an internal clock with frequency equal to one-half the
                 CLKOUT frequency is used to increment the counter. When XCLKSRC = 0,
                 an external signal from the CLKX pin can be used to increment the
                 counter. The external clock source is synchronized internally, thus
                 allowing for external asynchronous clock sources that do not exceed
                 the specified maximum allowable external clock frequency, i.e., less
                 than f(H1)/2.6.

4      Reserved  Read as zero.

5      XTSTAT    This bit indicates the status of the transmit timer. It tracks what
                 would be the output of the uninverted CLKX pin. This flag sets a CPU
                 interrupt on a transition from 0 to 1. A write has no effect.

6      RGO       The RGO bit resets and starts the receive timer counter. When RGO is
                 set to 1 and the timer is not held, the counter is zeroed and begins
                 incrementing on the next rising edge of the timer input clock. The RGO
                 bit is cleared on the same rising edge. Writing 0 to RGO has no effect
                 on the receive timer.

7      RHLD      Receive counter hold signal. When this bit is set to 0, the counter is
                 disabled and held in its current state. The internal divide-by-two
                 counter is also held so that the counter will continue where it left
                 off when RHLD is set to 1. The timer registers may be read and
                 modified while the timer is being held. RESET has priority over RHLD.

8      RC/P      Clock/pulse mode control. When RC/P = 1, the clock mode is chosen. The
                 signalling of the status flag and external output has a 50-percent
                 duty cycle. When RC/P = 0, the status flag and external output are
                 active for one CLKOUT cycle during each timer period.

9      RCLKSRC   This bit specifies the source of the receive timer clock. When
                 RCLKSRC = 1, an internal clock with frequency equal to one-half the
                 CLKOUT frequency is used to increment the counter. When RCLKSRC = 0,
                 an external signal from the CLKR pin can be used to increment the
                 counter. The external clock source is synchronized internally, thus
                 allowing for external asynchronous clock sources that do not exceed
                 the specified maximum allowable external clock frequency, i.e., less
                 than f(H1)/2.6.

10     Reserved  Read as zero.

11     RTSTAT    This bit indicates the status of the receive timer. It tracks what
                 would be the output of the uninverted CLKR pin. This flag sets a CPU
                 interrupt on a transition from 0 to 1. A write has no effect.

12-31  Reserved  Read as 0.


 31-12     11      10      9       8      7      6
   xx    RTSTAT    xx   RCLKSRC   RC/P   RHLD   RGO

    5       4       3       2      1      0
 XTSTAT    xx   XCLKSRC   XC/P   XHLD   XGO

NOTE: xx = Reserved bit, read as 0.


Figure 9-12. Receive/Transmit Timer Control Register 
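Because the transmit and receive timers use the same six-bit control group, a register image can be built from two identical sub-fields. The Python sketch below assumes the bit order GO, HLD, C/P, CLKSRC in bits 0-3 of each group, with the transmit group in bits 5-0 and the receive group in bits 11-6; the function names are hypothetical.

```python
def timer_control(go=False, run=True, clock_mode=True, internal_clock=True):
    """Build the control bits for one timer.

    run=True sets HLD to 1 (HLD = 0 holds the counter).
    clock_mode=True sets C/P to 1 (50-percent duty cycle output).
    internal_clock=True sets CLKSRC to 1 (internal clock source).
    """
    return (int(go) << 0) | (int(run) << 1) | (int(clock_mode) << 2) \
        | (int(internal_clock) << 3)


def timer_control_register(transmit, receive):
    """Place the transmit group in bits 5-0 and the receive group in bits 11-6."""
    return (transmit & 0x3F) | ((receive & 0x3F) << 6)


# Start both timers in clock mode from the internal clock.
both = timer_control(go=True)
print(hex(timer_control_register(both, both)))
```

Modelling HLD as `run` avoids the easy mistake of writing a 1 to "hold" the timer, since the hold condition is the 0 state of the bit.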


9.2.5 Receive/Transmit Timer Counter Register 


The timer counter register is a 32-bit register (see Figure 9-13). Bits 15-0 are
the transmit timer counter, and bits 31-16 are the receive timer counter. Each
counter is set to 0 whenever it increments to the value of the corresponding
period register. It is also set to 0 at reset.


 31                     16 15                      0
     RECEIVE COUNTER          TRANSMIT COUNTER


NOTE: All bits are read/write. 


Figure 9-13. Receive/Transmit Timer Counter Register 
9.2.6 Receive/Transmit Timer Period Register


The timer period register is a 32-bit register (see Figure 9-14). Bits 15-0 are
the transmit timer period, and bits 31-16 are the receive timer period. Each
field specifies the period of the corresponding timer. The register is set to 0
at reset.


 31                     16 15                      0
     RECEIVE PERIOD           TRANSMIT PERIOD


NOTE: All bits are read/write. 


Figure 9-14. Receive/Transmit Timer Period Register 
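The counter and period registers share one layout: a receive half in bits 31-16 and a transmit half in bits 15-0. The following Python sketch packs and unpacks those halves and models one counter step, under the assumption that a counter is cleared when it increments to the period value; the helper names are hypothetical.

```python
def pack_halves(receive, transmit):
    """Combine two 16-bit values into one 32-bit register image."""
    if not (0 <= receive <= 0xFFFF and 0 <= transmit <= 0xFFFF):
        raise ValueError("each half must fit in 16 bits")
    return (receive << 16) | transmit


def unpack_halves(value):
    """Return (receive, transmit) halves of a 32-bit register image."""
    return (value >> 16) & 0xFFFF, value & 0xFFFF


def tick(counter, period):
    """Advance a 16-bit counter by one; clear it when it reaches the period."""
    counter = (counter + 1) & 0xFFFF
    return 0 if counter == period else counter


period_register = pack_halves(receive=0x0010, transmit=0x0020)
print(hex(period_register), unpack_halves(period_register))
```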


9.2.7 Data Transmit Register 


When the data transmit register (DXR) is loaded, the transmitter loads the 
word into the transmit shift register (XSR), and the bits are shifted out. The 
delay from a write to DXR until an FSX occurs (or can be accepted) is two 
CLKX cycles. The word is not loaded into the shift register until the shifter is 
empty. When DXR is loaded into XSR, the XRDY bit is set, specifying that the 
buffer is available to receive the next word. Four tap points within the transmit 
shift register are used to transmit the word. These tap points correspond to 
the four data word sizes and are illustrated in Figure 9-15. The shift is a left
shift (LSB to MSB), with the data shifted out of the MSB corresponding to the
appropriate tap point.


<-- shift direction

 31          24 23          16 15           8 7            0
 ^ 32-bit       ^ 24-bit       ^ 16-bit       ^ 8-bit
   word tap       word tap       word tap       word tap


Figure 9-15. Transmit Buffer Shift Operation 
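The effect of the tap points can be illustrated with a short Python model: for a word length of N bits, the bits leave the shifter starting at bit N-1 of the right-justified word and ending at bit 0. This is a behavioral sketch only, and `transmit_bits` is a hypothetical name.

```python
def transmit_bits(word, length):
    """Return the bit sequence shifted out on DX, MSB of the field first.

    Only the low `length` bits of the right-justified word are sent; the
    tap point for an N-bit word sits at bit N-1.
    """
    if length not in (8, 16, 24, 32):
        raise ValueError("word length must be 8, 16, 24, or 32 bits")
    return [(word >> bit) & 1 for bit in range(length - 1, -1, -1)]


# With an 8-bit word length only the low byte (0xA5) is transmitted.
print(transmit_bits(0xFFFFFFA5, 8))
```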


9.2.8 Data Receive Register 


When serial data is input, the receiver shifts the bits into the receive shift reg- 
ister (RSR). When the specified number of bits are shifted in, the data receive 
register (DRR) is loaded from RSR and the RRDY status bit is set. The receiver 
is double-buffered. If the DRR has not been read and the RSR is full, the re- 
ceiver is frozen. New data coming into the DR pin is ignored. The receive 
shifter will not write over the DRR. The DRR must be read to allow new data 
in the RSR to be transferred to the DRR. When a write to DRR occurs at the 
same time that a RSR to DRR transfer takes place, the RSR to DRR transfer 
has priority. 


Data is shifted to the left (LSB to MSB). Figure 9-16 illustrates what happens 
when words less than 32 bits are shifted into the serial port. In this figure, it 
is assumed that an 8-bit word is being received and that the upper three bytes
of the receive buffer are originally undefined. In the first portion of the figure, 
byte a has been shifted in. When byte b is shifted in, byte a is shifted to the 
left. When the data receive register is read, both bytes a and b are read. 

<-- shift direction

  31     24 23     16 15      8 7       0
 |    x    |    x    |    x    |    a    |   after byte a is shifted in
 |    x    |    x    |    a    |    b    |   after byte b is shifted in


Figure 9-16. Receive Buffer Shift Operation 
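The receive shift described above can be modelled in a few lines of Python. The sketch starts from a cleared 32-bit shifter (the text treats the upper bytes as undefined, so zero is an arbitrary choice here) and shifts in two 8-bit words; the function names are hypothetical.

```python
def byte_bits(value):
    """Bits of an 8-bit value, MSB first, as they arrive on DR."""
    return [(value >> bit) & 1 for bit in range(7, -1, -1)]


def shift_in(rsr, bits):
    """Left-shift each incoming bit into a 32-bit receive shifter."""
    for bit in bits:
        rsr = ((rsr << 1) | (bit & 1)) & 0xFFFFFFFF
    return rsr


rsr = shift_in(0, byte_bits(0xAA))      # byte a occupies bits 7-0
rsr = shift_in(rsr, byte_bits(0xBB))    # byte a moves to bits 15-8
print(hex(rsr))
```

This shows why a read of the data receive register after two 8-bit words returns both bytes: the earlier word is pushed left rather than discarded.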


9.2.9 Serial Port Operation Configurations 
Several configurations are provided for the operation of the serial port clocks
and timer. The clocks for each serial port can originate either internally or
externally. Figure 9-17 shows serial port clocking in the I/O mode (FUNC = 0)
when CLKX is either an input or an output. Figure 9-18 shows clocking in the
serial port mode (FUNC = 1). Both figures use a transmit section as an example.
The same relationship holds for a receive section.
[Block diagrams, four panels, each divided into INTERNAL and EXTERNAL sides and
showing the timer (TIMER IN, TSTAT), XSR, DATOUT, DATIN, and the CLKX pin.]

(a) FUNC = 0 (I/O mode), CLKXI/O = 1 (CLKX, an output),
    XCLKSRC = 1 (internal clock for timer)
(b) FUNC = 0 (I/O mode), CLKXI/O = 1 (CLKX, an output),
    XCLKSRC = 0 (external clock for timer)
(c) FUNC = 0 (I/O mode), CLKXI/O = 0 (CLKX, an input),
    XCLKSRC = 1 (internal clock for timer)
(d) FUNC = 0 (I/O mode), CLKXI/O = 0 (CLKX, an input),
    XCLKSRC = 0 (external clock for timer)


Figure 9-17. Serial Port Clocking in I/O Mode
[Block diagrams, three panels, each divided into INTERNAL and EXTERNAL sides and
showing the timer (TSTAT), the internal clock, XSR, DATOUT (NC), DATIN, the
inverter (INV), and the CLKX pin.]

(a) FUNC = 1 (serial port mode), XCLKSRCE = 1 (output serial port clock),
    XCLKSRC = 0 or 1
(b) FUNC = 1 (serial port mode), XCLKSRCE = 0 (input serial port clock),
    XCLKSRC = 1 (internal clock for timer)
(c) FUNC = 1 (serial port mode), XCLKSRCE = 0 (input serial port clock),
    XCLKSRC = 0 (external clock for timer)


Figure 9-18. Serial Port Clocking in Serial Port Mode
9.2.10 Serial Port Timing


The formula for calculating the frequency of the serial port clock with an in- 
ternally generated clock is dependent upon the operation mode of the serial 
port timers, defined as: 


f (pulse mode) = f (timer clock) /period register 
f (clock mode) = f (timer clock)/(2 x period register) 
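As a worked illustration of the two formulas, the Python sketch below computes the serial port clock for both timer modes and checks an external clock against the f(H1)/2.6 limit quoted in this section. The function names and the example frequencies are hypothetical.

```python
def serial_clock_hz(timer_clock_hz, period, clock_mode):
    """Serial port clock frequency for an internally generated clock.

    Pulse mode: f = f(timer clock) / period register
    Clock mode: f = f(timer clock) / (2 x period register)
    """
    if period <= 0:
        raise ValueError("period register must be nonzero")
    divisor = 2 * period if clock_mode else period
    return timer_clock_hz / divisor


def external_clock_ok(clock_hz, h1_hz):
    """True if an external CLKX/CLKR frequency is below f(H1)/2.6."""
    return clock_hz < h1_hz / 2.6


# Example: an 8 MHz timer clock with a period register value of 4.
print(serial_clock_hz(8_000_000, 4, clock_mode=False))   # pulse mode
print(serial_clock_hz(8_000_000, 4, clock_mode=True))    # clock mode
```

For the same period value, clock mode yields half the frequency of pulse mode because the output toggles once per period instead of pulsing once per period.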


An externally generated serial port clock (CLKX or CLKR) has a maximum 
frequency less than f(H1)/2.6. See serial port timing in Appendix A. 


Transmit data is clocked out on the rising edge of the selected serial port clock. 
Receive data is latched into the receive shift register on the falling edge of the 
serial port clock. All data is transmitted and loaded MSB first and right-justi- 
fied. If less than 32 bits are transferred, the data is right-justified in the 32-bit 
transmit and receive buffers. Therefore, the LSBs of the transmit buffer are the 
bits that are transmitted. 


The transmit ready (XRDY) signal specifies that the data transmit register 
(DXR) is available to be loaded with new data. XRDY goes active as soon 
as the data is loaded into the transmit shift register (XSR). The last word may 
still be shifting out when XRDY goes active. If DXR is loaded before the last 
word has completed transmission, the data bits transmitted will be consec- 
utive, i.e., the LSB of the first word immediately precedes the MSB of the 
second, with all signalling valid as in two separate transmits. XRDY goes in- 
active when DXR is loaded, and remains inactive until the data is loaded into 
the shifter. 


The receive ready (RRDY) signal is active as long as a new word of data is 
loaded into the data receive register and has not been read. As soon as the 
data is read, the RRDY bit is turned off. 


When FSX is specified as an output, the activity of the signal is determined 
solely by the internal state of the serial port. When a fixed data rate is speci- 
fied, FSX goes active when DXR is loaded into XSR to be transmitted out. 
One serial clock cycle later, FSX turns inactive and data transmission begins. 
When a variable data rate is specified, the FSX pin is activated when the data 
transmission begins, and remains active during the entire transmission of the 
word. Again, the data is transmitted one clock cycle after it is loaded into the 
data transmit register. 


An input FSX in the fixed data rate mode should go active for at least one se- 
rial clock cycle and then inactive to initiate the data transfer. The transmitter 
then transmits the number of bits specified by the LEN bits. In the variable 
data rate mode, the transmitter begins transmitting as soon as FSX goes active 
until the number of specified bits has been shifted out. In the variable data 
rate mode, when the FSX status changes prior to all the data bits being shifted 
out, the transmission completes and the DX pin is placed in a high impedance 
state. An FSR input is exactly complementary to the FSX. 


When using an external FSX, if DXR and XSR are empty, a write to DXR re- 
sults in a DXR to XSR transfer. This data is held in the XSR until an FSX oc- 
curs. When the external FSX is received, the XSR begins shifting the data. If 
XSR is waiting for the external FSX, a write to DXR will change DXR, but a 
DXR to XSR transfer will not occur. XSR begins shifting when the external
FSX is received, or when reset using XRESET. 


Continuous Transmit and Receive Modes 


When continuous mode is chosen, consecutive writes do not generate or ex- 
pect new sync pulse signalling. Only the first word of a block begins with an 
active synchronization. Thereafter, data continues to be transmitted as long 
as new data is loaded into DXR before the last word has been transmitted. 
As soon as XRDY is active and all of the data has been transmitted out of the
shift register, the DX pin is placed in a high impedance state, and a subsequent
write to DXR initiates a new block and a new FSX.


Similarly with FSR, the receiver continues shifting in new data and loading 
DRR. If the data receive buffer is not read before the next word is shifted in, 
subsequent incoming data will be lost. The RFSM bit can be used to termi- 
nate the receive continuous mode. 
Handshake Mode


The handshake mode (HS = 1) allows for direct connection between proces- 
sors. In this mode, all data words are transmitted with a leading 1 (see Figure 
9-19). For example, if an 8-bit word is to be transmitted, the first bit sent is
a 1, followed by the 8-bit data word.


In this mode, once the serial port transmits a word, it will not transmit another 
word until it receives a separately transmitted zero bit. Therefore, the 1 bit that 
precedes every data word is, in effect, a request bit. 


 | 1 |      data word (8 bits)      |
   ^
   leading one


Figure 9-19. Data Word Format in Handshake Mode 


After a serial port receives a word (with the leading 1) and that word has been
read from the DRR, it sends a single 0 to the transmitting serial port. Thus, the
single 0 bit acts as an acknowledge bit (see Figure 9-20). This single
acknowledge bit is sent every time the DRR is read, even if the DRR does not
contain new data.


[Diagram: a single zero bit.]


Figure 9-20. Single Zero Sent as an Acknowledge 


When the serial port is placed in the handshake mode, the insertion and deletion
of a leading 1 for transmitted data, the sending of a 0 for acknowledgement of
received data, and the waiting for this acknowledge bit are all performed
automatically. Using this scheme, it is simple to connect processors with no
external hardware and guarantee secure communication. A typical configuration is
shown in Figure 9-21.


In the handshake mode, FSX is automatically configured as an output. Con- 
tinuous mode is automatically disabled. After a system reset or XRESET, the 
transmitter is always permitted to transmit. The transmitter and receiver must 
be reset when entering the handshake mode. 
[Diagram: the serial ports of TMS320C30 #1 and TMS320C30 #2 connected directly.]


Figure 9-21. Direct Connection Using Handshake Mode 
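The request/acknowledge exchange of the handshake mode can be summarized with a behavioral Python model. It is a sketch of the protocol as described in this section, not of the hardware: the class and function names are hypothetical, and the bits are passed as Python lists rather than clocked on a pin.

```python
class HandshakeTransmitter:
    """Transmitter that waits for a 0 acknowledge bit between words."""

    def __init__(self):
        # After a system reset or XRESET the transmitter may always transmit.
        self.may_transmit = True

    def send(self, word, length=8):
        """Return the framed bits (leading 1 + data), or None if blocked."""
        if not self.may_transmit:
            return None
        self.may_transmit = False
        data = [(word >> bit) & 1 for bit in range(length - 1, -1, -1)]
        return [1] + data

    def receive_ack(self, bit):
        """A separately transmitted 0 bit permits the next word."""
        if bit == 0:
            self.may_transmit = True


def receive_and_read(frame):
    """Strip the leading 1 and return (word, acknowledge bit sent on DRR read)."""
    if frame[0] != 1:
        raise ValueError("handshake frame must start with a leading 1")
    word = 0
    for bit in frame[1:]:
        word = (word << 1) | bit
    return word, 0


tx = HandshakeTransmitter()
frame = tx.send(0x5A)
word, ack = receive_and_read(frame)
tx.receive_ack(ack)
print(frame, hex(word), tx.may_transmit)
```

The model makes the flow control visible: a second `send` before `receive_ack(0)` returns `None`, which corresponds to the transmitter waiting for the acknowledge bit.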


9.2.11 Serial Port Interrupt Sources


A serial port has four interrupt sources: 


1) The transmit timer interrupt: The rising edge of XTSTAT causes a single 
cycle interrupt pulse to occur. When XTINT is 0, this interrupt pulse is 
disabled. 


2) The receive timer interrupt: The rising edge of RTSTAT causes a single 
cycle interrupt pulse to occur. When RTINT is 0, this interrupt pulse is 
disabled. 


3) The transmitter interrupt: Occurs immediately following a DXR to XSR
transfer. The transmitter interrupt is a single cycle pulse. When the
XINT bit of the serial port global control register is 0, this interrupt
pulse is disabled.


4) The receiver interrupt: Occurs immediately following a RSR to DRR
transfer. The receiver interrupt is a single cycle pulse. When the RINT
bit of the serial port global control register is 0, this interrupt pulse
is disabled.


The transmit timer interrupt pulse is ORed with the transmitter interrupt pulse 
to create the CPU transmit interrupt flag XINT. The receive timer interrupt 
pulse is ORed with the receiver interrupt pulse to create the CPU receive in- 
terrupt flag RINT. 
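The way the four sources combine into two CPU flags can be written out as a small Python function. This is a logic sketch only; the dictionary keys for the events are hypothetical names chosen for this example.

```python
def cpu_interrupt_flags(enables, events):
    """Return (transmit flag, receive flag) for one cycle.

    enables: XTINT, XINT, RTINT, RINT enable bits from the global control register.
    events:  which of the four source pulses occurred in this cycle.
    """
    transmit = (enables["XTINT"] and events["xtstat_rising"]) or \
               (enables["XINT"] and events["dxr_to_xsr"])
    receive = (enables["RTINT"] and events["rtstat_rising"]) or \
              (enables["RINT"] and events["rsr_to_drr"])
    return bool(transmit), bool(receive)


enables = {"XTINT": 0, "XINT": 1, "RTINT": 0, "RINT": 1}
events = {"xtstat_rising": True, "dxr_to_xsr": False,
          "rtstat_rising": False, "rsr_to_drr": True}
print(cpu_interrupt_flags(enables, events))
```

In this example the timer edge on the transmit side is masked because XTINT is 0, while the RSR to DRR transfer raises the receive flag.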


9.2.12 Serial Port Functional Operation 
The following paragraphs and figures illustrate the functional timing of the
various serial port modes of operation. The timing descriptions are presented
assuming that all signal polarities are configured to be positive, i.e.,
CLKXP = CLKRP = DXP = DRP = FSXP = FSRP = 0. Logical timing, in situations
where one or more of these polarities are inverted, is the same but with respect
to the opposite polarity reference points, i.e., rising vs. falling edges, etc.


These discussions pertain to the numerous operating modes and configurations
of the serial port logic. When it is necessary to switch operating modes
or change configurations of the serial port, this should be done only when
XRESET or RRESET is asserted (low) as appropriate. Therefore, when
transmit configurations are modified, XRESET should be low, and when receive
configurations are modified, RRESET should be low. When in handshake
mode, however, since the transmitter and receiver are interrelated, any
configuration changes should be made with XRESET and RRESET both low.


All of the various serial port operating configurations can be broadly classified 
in two categories: fixed data rate timing and variable data rate timing. The 
following paragraphs discuss fixed and variable data rate operation and all of 
their variations. 


Fixed Data Rate Timing Operation 


Fixed data rate serial port transfers can occur in two varieties: burst mode and
continuous mode. In burst mode operation, transfers of single words are
separated by periods of inactivity on the serial port. In continuous mode, there
are no gaps between successive word transfers, i.e., the first bit of a new word
is transferred on the next CLKX/R pulse following the last bit of the previous
word. This occurs continuously until the process is terminated.


In burst mode with fixed data rate timing, FSX/FSR pulses initiate transfers, 
and each transfer involves a single word. With an internally generated FSX 
(see Figure 9-22), transmission is initiated by loading DXR. In this mode, 
there is an approximately 2.5 CLKX cycle delay (depending on CLKX and H1 
frequencies) from DXR being loaded until FSX occurs. With an external FSX, 
the FSX pulse initiates the transfer and the 2.5 cycle delay effectively becomes 
a setup requirement for loading DXR with respect to FSX. Therefore, in this 
case, DXR must be loaded no later than 3 CLKX cycles before FSX occurs. 
Once the XSR is loaded from the DXR, an XINT is generated. 


[Timing diagram. Signals: CLKX/R, FSR/FSX (external), FSX (internal), and DX/DR.
Marked events: DXR loaded, XINT, RINT.]


Figure 9-22. Fixed Burst Mode 


In receive operations, once a transfer is initiated, FSR is ignored until the last
bit. For burst mode transfers, FSR must be low during the last bit, or another
transfer will be initiated. After a full word has been received and transferred to
the DRR, an RINT is generated.


In fixed data rate mode, continuous transfers may be performed even if
R/XFSM = 0, as long as properly timed frame synchronization is provided or, with
an internally generated FSX, DXR is reloaded each cycle (see Figure 9-23).
[Timing diagram. Signals include FSR/FSX (external). Marked events: DXR loaded
for the first two words, then XINT and RINT for each word, with "load DXR" and
"read DRR" at each interrupt.]


Figure 9-23. Fixed Continuous Mode With Frame Synch 


For receive operations and with externally generated FSX, once transfers have
begun, frame sync pulses are only required during the last bit transferred to
initiate another contiguous transfer. Otherwise, frame sync inputs are ignored.
Therefore, continuous transfers will occur if frame sync is held high. With an
internally generated FSX, there is an approximately 2.5 CLKX cycle delay
following DXR being loaded before FSX occurs. This delay occurs each time
DXR is loaded; therefore, during continuous transmission, the instruction
that loads DXR must be executed by the N-3 bit (for an N-bit transmission).
Since delays due to pipelining may vary, a conservative margin of
safety should be incorporated in accounting for this delay.


Once the process begins, an XINT and an RINT are generated at the beginning
of each transfer. The XINT indicates that the XSR has been loaded from DXR,
and can be used to cause DXR to be reloaded. To maintain continuous
transmission in this mode, especially with an internally generated FSX, DXR must
be reloaded early in the ongoing transfer.


The RINT indicates that a full word has been received and transferred into the 
DRR. RINT is therefore commonly used to indicate an appropriate time to read 
DRR. 


Continuous transfers are terminated by discontinuing frame sync pulses or, in 
the case of internally generated FSX, not reloading DXR. 


Continuous serial port transfers can be accomplished without the use of frame
sync pulses if R/XFSM is set to 1. In this mode, operation of the serial
port is similar to continuous operation with frame sync except that a frame
sync pulse is involved only in the first word transferred, and no further frame
sync pulses are used. Following the first word transferred (see Figure 9-24),
no internal frame sync pulses are generated, and frame sync inputs are ignored.
Additionally, R/XFSM should be set prior to or during the first word
transferred, and must be set no later than the transfer of the N-1 bit of the
first word, except for transmit operations. For transmit operations in the fixed
data rate mode, XFSM must be set no later than the N-2 bit. Clearing R/XFSM
must be performed no later than the N-1 bit to be recognized in the current
cycle.


[Timing diagram. Signals: CLKX/R, FSR/FSX (external), FSX (internal), and DX/DR.
Marked events: DXR loaded, XINT, set R/XFSM, DXR loaded, then XINT and RINT
for each following word, with "load DXR" and "read DRR" at each interrupt.]


Figure 9-24. Fixed Continuous Mode Without Frame Synch 


Timing of RINT and XINT and data transfers to and from DXR and DRR, re- 
spectively, are the same as in fixed data rate continuous mode with frame sync. 
This mode of operation also exhibits the same 2.5 CLKX cycle delay following 
DXR being loaded before an internal FSX is generated. As in the case of 
continuous operation in fixed data rate mode with frame sync, DXR must be 
reloaded no later than transmission of the N-3 bit. 


When using continuous operation in fixed data rate mode, R/XFSM may be 
set and cleared as desired, even during active transfers, to enable or disable the 
use of frame sync pulses as dictated by system requirements. Under most 
conditions, the effect of changing the state of R/XFSM occurs during the 
transfer in which the R/XFSM change was made, provided the change was 
made early enough in the transfer. For transmit operations with internal FSX 
in fixed data rate mode, however, a one word delay occurs before frame sync 
pulse generation resumes when clearing XFSM to zero (see Figure 9-25). 
Therefore, one additional word is transferred in this case before the next FSX 
pulse is generated. Also note that, as discussed previously, clearing XFSM 
will be recognized during the current word being transmitted as long as XFSM 
is cleared no later than the N-1 bit. Setting XFSM is recognized as long as 
XFSM is set no later than the N-2 bit. 
[Timing diagram. Five consecutive words (1st through 5th) with FSX (internal) and
the transmitted data. Marked events: load DXR and two changes of XFSM.]


Figure 9-25. Exiting Fixed Continuous Mode Without Frame Synch, FSX Internal 


Variable Data Rate Timing Operation 


Variable data rate timing also supports operation in either burst or continuous
mode. Burst mode operation with variable data rate timing is similar to burst
mode operation with fixed data rate timing. With variable data rate timing (see
Figure 9-26), however, FSX/R and data timing differ slightly at the beginning
and end of transfers. Specifically, there are three major differences between
fixed and variable data rate timing.


[Timing diagram: FSR/FSX (external) and FSX (internal), with markers for the
points at which DXR is loaded and at which XINT and RINT occur.]


Figure 9-26. Variable Burst Mode 


First, FSX/R pulses typically last for the entire transfer interval, although FSR 
and external FSX are ignored after the first bit transferred. FSX/R pulses in 
fixed data rate mode typically last only one CLKX/R cycle, but can last longer. 


Second, data transfer begins during the CLKX/R cycle in which FSX/R occurs, 
rather than the CLKX/R cycle following FSX/R, as is the case with fixed data 
rate timing. 
Finally, with variable data rate timing, frame sync inputs are ignored until the
end of the last bit transferred, rather than the beginning of the last bit
transferred, as is the case with fixed data rate timing.


When transmitting continuously in variable data rate mode with frame sync,
timing is the same as for fixed data rate mode, apart from the differences
between these two modes described under burst mode operation with variable
data rate timing. The only exception is that when operating continuously in
variable data rate mode (see Figure 9-27), DXR must be reloaded no later than
the N-4 bit to maintain continuous operation, as opposed to the N-3 bit for
fixed data rate mode.


[Timing diagram: FSR/FSX (external) and the transmitted and received data
words, with markers for DXR loaded, XINT, RINT, and the following LOAD DXR and
READ DRR operations.]


Figure 9-27. Variable Continuous Mode With Frame Synch 


Continuous operation in variable data rate mode without frame sync (see
Figure 9-28) is also similar to continuous operation without frame sync in
fixed data rate mode. As with variable data rate mode continuous operation
with frame sync, DXR must be reloaded no later than the N-4 bit to maintain
continuous operation. Additionally, when R/XFSM is set or cleared in variable
data rate mode, the modification must be made no later than the N-1 bit for
the change to take effect in the current transfer.
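The deadlines quoted in this section can be gathered into one lookup. The following Python sketch is only a summary aid: the names DEADLINES and deadline_label are illustrative and not part of any TI tool, and the sketch makes no assumption about how bits are numbered within a word. It simply reports the k in "the N-k bit" for each case stated in the text.

```python
# Latest point, expressed as the k in "the N-k bit", at which an action
# still affects the current word. Values are those quoted in the text.
DEADLINES = {
    ("fixed", "reload_dxr"): 3,      # continuous mode, fixed data rate
    ("fixed", "clear_xfsm"): 1,
    ("fixed", "set_xfsm"): 2,
    ("variable", "reload_dxr"): 4,   # continuous mode, variable data rate
    ("variable", "clear_xfsm"): 1,
    ("variable", "set_xfsm"): 1,
}

def deadline_label(mode, action):
    """Return the deadline as a string such as 'N-3 bit'."""
    return "N-%d bit" % DEADLINES[(mode, action)]

print(deadline_label("fixed", "reload_dxr"))     # N-3 bit
print(deadline_label("variable", "reload_dxr"))  # N-4 bit
```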
[Timing diagram: FSX (internal), with markers for DXR loaded, XINT, RINT, the
point at which R/XFSM is set, and the following LOAD DXR and READ DRR
operations.]


Figure 9-28. Variable Continuous Mode Without Frame Synch 
9.3 DMA Controller


The TMS320C30 provides an on-chip Direct Memory Access (DMA) con- 
troller. The purpose of the DMA controller is to reduce the need for the CPU 
to perform input/output functions. The DMA controller can perform 
input/output operations without interfering with the operation of the CPU. 
Therefore, it is possible to interface the TMS320C30 to slow external memo- 
ries and peripherals (A/D’s, serial ports, etc.) without reducing the computa- 
tional throughput of the CPU. The result is improved system performance and 
decreased system cost. 


A DMA transfer consists of two operations: a read from a memory location and 
a write to a memory location. The DMA controller can read from and write to 
any location in the TMS320C30 memory map. This includes all memory- 
mapped peripherals. The operation of the DMA is controlled with the fol- 
lowing set of memory-mapped registers: 


•  DMA global control register
•  DMA source address register
•  DMA destination address register
•  DMA transfer counter register


These registers, their memory-mapped addresses, and functions are shown in 
Figure 9-29. Each of these DMA registers will be discussed in the succeeding 
subsections. 


Register                     Peripheral Address

DMA GLOBAL CONTROL           808000h


Figure 9-29. Memory-Mapped Locations for a DMA Channel 
9.3.1 DMA Global Control Register

The global control register controls the state in which the DMA controller
operates. This register also indicates the status of the DMA, which changes
every cycle. Source and destination addresses can be incremented, decremented,
or synchronized using specified global control register bits. At system
reset, all bits in the DMA global control register are set to 0. Table 9-7
lists the register bits, names, and functions. Figure 9-30 shows the bit
configuration of the global control register.


Table 9-7. Global Control Register Bits 
| BIT  | NAME   | FUNCTION
| 0-1  | START  | These bits control the state in which the DMA starts and
|      |        | stops. The DMA may be stopped without any loss of data
|      |        | (see Table 9-8).
| 2-3  | STAT   | These bits indicate the status of the DMA. These status
|      |        | bits change every cycle (see Table 9-9).
| 4    | INCSRC | If INCSRC = 1, the source address is incremented after
|      |        | every read.
| 5    | DECSRC | If DECSRC = 1, the source address is decremented after
|      |        | every read. If INCSRC = DECSRC, the source address is not
|      |        | modified after a read.
| 6    | INCDST | If INCDST = 1, the destination address is incremented
|      |        | after every write.
| 7    | DECDST | If DECDST = 1, the destination address is decremented
|      |        | after every write. If INCDST = DECDST, the destination
|      |        | address is not modified after a write.
| 8-9  | SYNCH  | The SYNCH bits determine the timing synchronization
|      |        | between the events initiating the source and the
|      |        | destination transfers. The interpretation of the SYNCH
|      |        | bits is shown in Table 9-10.
| 10   | TC     | The TC bit affects the operation of the transfer counter.
|      |        | If TC = 0, transfers are not terminated when the transfer
|      |        | counter becomes zero. If TC = 1, transfers are terminated
|      |        | when the transfer counter becomes zero.
| 11   | TCINT  | If TCINT = 1, the DMA interrupt is set when the transfer
|      |        | counter makes a transition to zero. If TCINT = 0, the DMA
|      |        | interrupt is not set when the transfer counter makes a
|      |        | transition to zero.


BITS    31-16
NAME    xx

BITS    15-12  11     10   9-8    7       6       5       4       3-2   1-0
NAME    xx     TCINT  TC   SYNCH  DECDST  INCDST  DECSRC  INCSRC  STAT  START
ACCESS         R/W    R/W  R/W    R/W     R/W     R/W     R/W     R     R/W


NOTE: xx = Reserved bit, read as 0. 
R = read, W = write. 


Figure 9-30. DMA Global Control Register 
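The field layout of the global control register lends itself to a small pack and unpack helper. The Python sketch below is illustrative only: the names FIELDS, pack, and unpack are hypothetical, and the bit positions assume START in bits 1-0, STAT in bits 3-2, INCSRC through DECDST in bits 4-7, SYNCH in bits 9-8, TC in bit 10, and TCINT in bit 11. STAT is read-only on the device, so packing it is meaningful only for decoding tests.

```python
# Assumed field layout: name -> (lowest bit, width in bits).
FIELDS = {
    "START": (0, 2), "STAT": (2, 2),
    "INCSRC": (4, 1), "DECSRC": (5, 1),
    "INCDST": (6, 1), "DECDST": (7, 1),
    "SYNCH": (8, 2), "TC": (10, 1), "TCINT": (11, 1),
}

def pack(**values):
    """Build a register word from field values; unnamed fields are 0."""
    word = 0
    for name, value in values.items():
        low, width = FIELDS[name]
        if not 0 <= value < (1 << width):
            raise ValueError("%s out of range" % name)
        word |= value << low
    return word

def unpack(word):
    """Split a register word into a dict of field values."""
    return {name: (word >> low) & ((1 << width) - 1)
            for name, (low, width) in FIELDS.items()}

# Start the DMA, increment both addresses, and both terminate and
# interrupt when the transfer counter reaches zero.
ctrl = pack(START=0b11, INCSRC=1, INCDST=1, TC=1, TCINT=1)
print(hex(ctrl))  # 0xc53
```

Keeping the layout in one table means that a change to a field position needs to be made in only one place.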
Table 9-8. START Bits and Operation of the DMA


START   FUNCTION

0 0     DMA read or write cycles in progress will be completed; any data
        read will be ignored. Any pending read or write will be cancelled.
        The DMA is reset so that when started, a new transaction is begun;
        i.e., a read is performed.

0 1     If a read or write has begun, the read or write is completed before
        stopping, i.e., in the middle or at the end of a DMA transfer. If a
        read or write has not begun, no read or write is started.

1 0     If a DMA transfer has begun, the entire transfer is completed
        (including both read and write operations) before stopping. If a
        transfer has not begun, none is started.

1 1     DMA starts from reset or restarts from the previous state.


Table 9-9. STAT Bits and Status of the DMA 


STAT    FUNCTION

0 0     DMA is being held between DMA transfers (between a write and the
        next read). This is the value at reset.

0 1     DMA is being held in the middle of a DMA transfer, i.e., between a
        read and a write.

1 1     DMA busy; i.e., DMA is performing a read or write.


Table 9-10. SYNCH Bits and Synchronization of the DMA 


SYNCH   FUNCTION

0 0     No synchronization. Enabled interrupts are ignored.

0 1     Source synchronization. A read is performed when an enabled
        interrupt occurs.

1 0     Destination synchronization. A write is performed when an enabled
        interrupt occurs.

1 1     Source and destination synchronization. A read is performed when an
        enabled interrupt occurs. A write is then performed when the next
        enabled interrupt occurs.
9.3.2 Destination and Source Address Registers


The DMA destination and source address registers are 24-bit registers. These
registers are used when performing the increment and decrement as specified
by control bits DECSRC, INCSRC, DECDST, and INCDST of the DMA global
control register. The contents of these registers specify the destination and
source addresses. The registers are incremented or decremented at the end
of the corresponding memory access, i.e., source register for a read,
destination register for a write. On system reset, 0 is written to these
registers.


9.3.3 Transfer Counter Register 


The transfer counter register is a 24-bit register, controlled by a 24-bit counter
that counts down. The counter decrements upon the completion of a DMA
memory write. In this way, it can be used to control the size of a block of data
transferred. The transfer counter register is set to 0 at system reset.


9.3.4 CPU/DMA Interrupt Enable Register 


The CPU/DMA interrupt enable register (IE) is a 32-bit register located in the
CPU register file. The CPU interrupt enable bits are in locations 10-0. The
DMA interrupt enable bits are in locations 26-16. A 1 in a CPU/DMA interrupt
enable register bit enables the corresponding interrupt. A 0 disables the
corresponding interrupt. At reset, 0 is written to this register.


Table 9-11 lists the bits, names, and functions of the CPU/DMA interrupt
enable register. Figure 9-31 shows the IE register. The priority and decoding
schemes of CPU and DMA interrupts are identical. Note that when the DMA
receives an interrupt, the interrupt is acted upon according to the SYNCH field
of the DMA global control register. Note also that an interrupt may affect the
DMA but not the CPU, and vice versa. Refer to Section 7.
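Table 9-11. CPU/DMA Interrupt Enable Register Bits


| BIT   | NAME   | FUNCTION
| 0     | EINT0  | Enable external interrupt 0 (CPU)
| 1     | EINT1  | Enable external interrupt 1 (CPU)
| 2     | EINT2  | Enable external interrupt 2 (CPU)
| 3     | EINT3  | Enable external interrupt 3 (CPU)
| 4     | EXINT0 | Enable serial port 0 transmit interrupt (CPU)
| 5     | ERINT0 | Enable serial port 0 receive interrupt (CPU)
| 6     | EXINT1 | Enable serial port 1 transmit interrupt (CPU)
| 7     | ERINT1 | Enable serial port 1 receive interrupt (CPU)
| 8     | ETINT0 | Enable timer 0 interrupt (CPU)
| 9     | ETINT1 | Enable timer 1 interrupt (CPU)
| 10    | EDINT0 | Enable DMA controller interrupt (CPU)
| 11-15 | xx     | Read as 0
| 16    | EINT0  | Enable external interrupt 0 (DMA)
| 17    | EINT1  | Enable external interrupt 1 (DMA)
| 18    | EINT2  | Enable external interrupt 2 (DMA)
| 19    | EINT3  | Enable external interrupt 3 (DMA)
| 20    | EXINT0 | Enable serial port 0 transmit interrupt (DMA)
| 21    | ERINT0 | Enable serial port 0 receive interrupt (DMA)
| 22    | EXINT1 | Enable serial port 1 transmit interrupt (DMA)
| 23    | ERINT1 | Enable serial port 1 receive interrupt (DMA)
| 24    | ETINT0 | Enable timer 0 interrupt (DMA)
| 25    | ETINT1 | Enable timer 1 interrupt (DMA)
| 26    | EDINT0 | Enable DMA controller interrupt (DMA)
| 27-31 | xx     | Read as 0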

BITS    31-27  26      25      24      23      22      21      20      19     18     17     16
NAME    xx     EDINT0  ETINT1  ETINT0  ERINT1  EXINT1  ERINT0  EXINT0  EINT3  EINT2  EINT1  EINT0
               (DMA)   (DMA)   (DMA)   (DMA)   (DMA)   (DMA)   (DMA)   (DMA)  (DMA)  (DMA)  (DMA)
ACCESS         R/W     R/W     R/W     R/W     R/W     R/W     R/W     R/W    R/W    R/W    R/W

BITS    15-11  10      9       8       7       6       5       4       3      2      1      0
NAME    xx     EDINT0  ETINT1  ETINT0  ERINT1  EXINT1  ERINT0  EXINT0  EINT3  EINT2  EINT1  EINT0
               (CPU)   (CPU)   (CPU)   (CPU)   (CPU)   (CPU)   (CPU)   (CPU)  (CPU)  (CPU)  (CPU)
ACCESS         R/W     R/W     R/W     R/W     R/W     R/W     R/W     R/W    R/W    R/W    R/W


NOTE: xx = Reserved bit, read as 0.
R = read, W = write.


Figure 9-31. CPU/DMA Interrupt Enable Register 
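Because the CPU and DMA enable fields of the IE register use the same ordering of interrupt sources, a single table of source names can generate either half of the word. The Python sketch below is illustrative only; SOURCES, CPU_BASE, DMA_BASE, and ie_mask are hypothetical names. It assumes the ordering EINT0 (lowest bit) through EDINT0 (highest bit), with CPU enables starting at bit 0 and DMA enables starting at bit 16.

```python
# Interrupt sources in ascending bit order within each 11-bit field.
SOURCES = ["EINT0", "EINT1", "EINT2", "EINT3", "EXINT0", "ERINT0",
           "EXINT1", "ERINT1", "ETINT0", "ETINT1", "EDINT0"]
CPU_BASE = 0    # CPU enables occupy bits 10-0
DMA_BASE = 16   # DMA enables occupy bits 26-16

def ie_mask(cpu=(), dma=()):
    """Return the IE value that enables the named sources."""
    word = 0
    for name in cpu:
        word |= 1 << (CPU_BASE + SOURCES.index(name))
    for name in dma:
        word |= 1 << (DMA_BASE + SOURCES.index(name))
    return word

# Let serial port 0 receive events pace the DMA, and let the DMA
# controller interrupt the CPU.
ie = ie_mask(cpu=["EDINT0"], dma=["ERINT0"])
print(hex(ie))  # 0x200400
```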
9.3.5 DMA Memory Transfer Operation


Each DMA memory transfer consists of two parts: 


1) Read data from the address specified by the DMA source register. 
2) Write data that has been read to the address specified by the DMA des- 
tination register. 


A transfer is complete only when the read and write are complete. A transfer 
may be stopped by setting the START bits to the desired value. When the 
DMA is restarted (START = 1 1), it completes any pending transfer. 


At the end of a DMA read, the source address is modified as specified by the
INCSRC and DECSRC bits of the DMA global control register. At the end of
a DMA write, the destination address is modified as specified by the INCDST
and DECDST bits of the DMA global control register. At the end of every
DMA write, the DMA transfer counter is decremented.
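The read, write, address update, and counter decrement sequence can be modeled in a few lines. The Python sketch below is a behavioral illustration, not a description of the hardware implementation: the names step and dma_block are hypothetical, memory is represented by a dictionary, and wrapping addresses to 24 bits is an assumption based on the register width. The sketch models the TC = 1 case, in which transfers stop when the counter reaches zero, and it requires a starting count of at least 1.

```python
MASK24 = 0xFFFFFF  # address registers are 24 bits wide

def step(addr, inc, dec):
    """Apply the INC/DEC rule: equal bits leave the address unchanged."""
    if inc == dec:
        return addr
    return (addr + (1 if inc else -1)) & MASK24

def dma_block(mem, src, dst, count, incsrc=1, decsrc=0, incdst=1, decdst=0):
    """Transfer count words; return the final source and destination."""
    if count < 1:
        raise ValueError("count must be at least 1 in this sketch")
    while count:
        data = mem[src]                   # 1) read from the source address
        src = step(src, incsrc, decsrc)   # source modified after the read
        mem[dst] = data                   # 2) write to the destination
        dst = step(dst, incdst, decdst)   # destination modified after write
        count -= 1                        # counter decrements after write
    return src, dst

mem = {0x100: 10, 0x101: 20, 0x102: 30}
end = dma_block(mem, 0x100, 0x200, 3)
print(end == (0x103, 0x203), mem[0x200], mem[0x201], mem[0x202])
```

Setting incsrc=0 and decsrc=0 holds the source address fixed, which is the arrangement one would use when the source is a single memory-mapped peripheral register.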


DMA on-chip reads and writes (reads and writes from on-chip memory and
peripherals) are single cycle. DMA off-chip reads are two cycles. The first
cycle is an internal setup with the external read beginning on the following
cycle. The external read cycle is identical to a CPU read cycle. DMA off-chip
writes are identical to CPU off-chip writes.


Through the 24-bit source and destination registers, the DMA is capable of
accessing any memory-mapped location in the TMS320C30 memory map.
Figure 9-32 through Figure 9-34 show the number of cycles a DMA transfer
requires, depending upon whether the source and destination are on-chip
memory and peripherals, the external port, or the I/O port. T represents the
number of transfers to be performed. Cr represents the number of wait states
for the source read. Cw represents the number of wait states for the destination
write. Each entry in the table represents the total cycles required to do the T
transfers, assuming no pipeline conflicts.


Accompanying each table is a figure illustrating the timing of the DMA trans- 
fer. |R| and |W| represent single-cycle reads and writes, respectively. |R.R| 
and |W.W| represent multicycle reads and writes. |Cr| and |Cw| show the 
number of wait cycles for a read and write. |-| represents the cycle used as 
an internal setup for DMA external reads. 
Figure 9-32. Timing and Number of Cycles for DMA Transfers When Destination
is On-Chip


Figure 9-33. DMA Timing When Destination is a Primary Bus


Figure 9-34. DMA Timing When Destination is an Expansion Bus
Table 9-12 shows the maximum DMA transfer rates assuming no wait states
(Cr = Cw = 0). Table 9-13 shows the maximum DMA transfer rates assuming
one wait state for the read (Cr = 1) and no wait states for the write (Cw = 0).
Table 9-14 shows the maximum DMA transfer rates assuming one wait state
for the read (Cr = 1) and one wait state for the write (Cw = 1).


In each table, the complete transfer is considered (i.e., the time to do the read
and the write). Since one bus access is required for the read and another for
the write, the rate of bus accesses is twice the DMA transfer rate. It is also
assumed that no conflicts with the CPU exist.


Table 9-12. Maximum DMA Transfer Rates When Cr = Cw = 0


                                   DESTINATION
SOURCE        INTERNAL            PRIMARY             EXPANSION

INTERNAL      33.3 Mbytes/sec     33.3 Mbytes/sec     33.3 Mbytes/sec
PRIMARY       22.2 Mbytes/sec     16.7 Mbytes/sec     22.2 Mbytes/sec
EXPANSION     22.2 Mbytes/sec     22.2 Mbytes/sec     16.7 Mbytes/sec

Table 9-13. Maximum DMA Transfer Rates When Cr = 1, Cw = 0


                                   DESTINATION
SOURCE        INTERNAL            PRIMARY             EXPANSION

INTERNAL      33.3 Mbytes/sec     33.3 Mbytes/sec     33.3 Mbytes/sec
PRIMARY       16.7 Mbytes/sec     13.3 Mbytes/sec     16.7 Mbytes/sec
EXPANSION     16.7 Mbytes/sec     16.7 Mbytes/sec     13.3 Mbytes/sec

Table 9-14. Maximum DMA Transfer Rates When Cr = 1, Cw = 1


                                   DESTINATION
SOURCE        INTERNAL            PRIMARY             EXPANSION

INTERNAL      33.3 Mbytes/sec     22.2 Mbytes/sec     22.2 Mbytes/sec
PRIMARY       16.7 Mbytes/sec     11.1 Mbytes/sec     16.7 Mbytes/sec
EXPANSION     16.7 Mbytes/sec     16.7 Mbytes/sec     11.1 Mbytes/sec
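The entries in Tables 9-12 through 9-14 follow from the number of cycles each transfer occupies. The Python sketch below shows the conversion. Two values are assumptions rather than statements from this section: a 60-ns cycle time and 4 bytes (one 32-bit word) per transfer. Both are inferred from the tables, since an on-chip to on-chip transfer (one single-cycle read plus one single-cycle write) is listed at 33.3 Mbytes/sec. The names CYCLE_NS, BYTES_PER_WORD, and rate_mbytes are illustrative.

```python
CYCLE_NS = 60.0      # assumed cycle time in ns, inferred from the tables
BYTES_PER_WORD = 4   # assumed: one 32-bit word moved per DMA transfer

def rate_mbytes(cycles_per_transfer):
    """Sustained transfer rate in Mbytes/sec for a given cycle count."""
    seconds = cycles_per_transfer * CYCLE_NS * 1e-9
    return BYTES_PER_WORD / seconds / 1e6

for cycles in (2, 3, 4, 5, 6):
    print(cycles, round(rate_mbytes(cycles), 1))
# prints 33.3, 22.2, 16.7, 13.3, and 11.1 for 2 through 6 cycles
```

Every rate listed in the three tables is one of these five values, so each table entry corresponds to a transfer that effectively occupies between two and six cycles.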
9.3.6 Synchronization of DMA Channels

A DMA channel may be synchronized through the use of interrupts. Refer to
Table 9-10 for the relationship between the SYNCH bits of the DMA global
control register and the synchronization performed. This section describes the
following four synchronization mechanisms:


•  No synchronization (SYNCH = 0 0)

•  Source synchronization (SYNCH = 0 1)

•  Destination synchronization (SYNCH = 1 0)

•  Source and destination synchronization (SYNCH = 1 1)
No Synchronization


When SYNCH = 0 0, no synchronization is performed. The DMA will perform reads and
writes whenever there are no conflicts. All interrupts are ignored, and therefore can be
considered to be globally disabled. However, no bits in the DMA interrupt enable register
are changed. Figure 9-35 shows the synchronization mechanism when SYNCH = 0 0.


START:  DISABLE DMA INTERRUPTS GLOBALLY
        DMA CHANNEL PERFORMS A READ
        DMA CHANNEL PERFORMS A WRITE
        GO TO START


Figure 9-35. No DMA Synchronization 
Source Synchronization


When SYNCH = 0 1, the DMA is synchronized to the source (see Figure 9-36). A read
will not be performed until an interrupt is received by the DMA. Then, all DMA interrupts
are disabled globally. However, no bits in the DMA interrupt enable register are changed.


ENABLE DMA INTERRUPTS GLOBALLY 


GO TO START 


Figure 9-36. DMA Source Synchronization 
Destination Synchronization


When SYNCH = 1 0, the DMA is synchronized to the destination. First, all interrupts are 
ignored until the read is complete. Though the DMA interrupts may be considered to be 
globally disabled, no bits in the DMA interrupt enable register are changed. A write will 
not be performed until an interrupt is received by the DMA. Figure 9-37 shows the syn- 
chronization mechanism when SYNCH = 1 0. 


DMA CHANNEL PERFORMS A WRITE 
GO TO START 


Figure 9-37. DMA Destination Synchronization 
Source and Destination Synchronization


When SYNCH = 1 1, all interrupts are ignored, and therefore can be considered to be
globally disabled. However, no bits in the DMA interrupt enable register are changed. A
read is performed when an interrupt is received. A write is performed on the following
interrupt. Source and destination synchronization when SYNCH = 1 1 is shown in
Figure 9-38.


ENABLE DMA INTERRUPT GLOBALLY 
GO TO START 


Figure 9-38. DMA Source and Destination Synchronization 
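The four synchronization mechanisms differ only in which half of a transfer waits for an enabled interrupt. The Python sketch below models that difference. It is a simplified illustration under stated assumptions: the name run_dma and the event strings "int" and "tick" are hypothetical, each event permits at most one DMA action, and an interrupt that arrives while the DMA is not waiting for one is simply ignored rather than remembered.

```python
def run_dma(synch, events, transfers):
    """Return the ordered list of DMA actions for a stream of events.

    synch     -- SYNCH field value (0b00, 0b01, 0b10, or 0b11)
    events    -- "int" for an enabled interrupt, "tick" for any other cycle
    transfers -- number of complete read/write transfers to perform
    """
    wait_read = synch in (0b01, 0b11)    # read waits for an interrupt
    wait_write = synch in (0b10, 0b11)   # write waits for an interrupt
    log, phase = [], "read"
    for ev in events:
        if len(log) == 2 * transfers:
            break
        needs_int = wait_read if phase == "read" else wait_write
        if needs_int and ev != "int":
            continue                     # still waiting for an interrupt
        log.append(phase)
        phase = "write" if phase == "read" else "read"
    return log

print(run_dma(0b10, ["int", "tick", "int"], 1))  # ['read', 'write']
```

In the example, the first interrupt is ignored because destination synchronization performs the read unconditionally; the write then waits for the next interrupt.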
Section 10


Pipeline Operation

TMS320C30 operation is controlled by five major functional units: fetch,
decode, read, execute, and DMA. To provide for maximum processor throughput,
these units can perform in parallel, with each unit operating on a different
instruction. The overlapping of the fetch, decode, read, and execute
operations of different instructions is called pipelining. The pipelining of these
operations results in the high performance of the TMS320C30. The ability of
the DMA to move data within the processor memory space results in an even
greater utilization of the CPU with fewer interruptions of the pipeline, thus
yielding greater performance.


Major topics discussed in this section are as follows: 


•  Pipeline Structure (Section 10.1)

•  Pipeline Conflicts (Section 10.2)
   -  Branch conflicts
   -  Register conflicts
   -  Memory conflicts

•  Resolving Memory Conflicts (Section 10.3)

•  Clocking of Memory Accesses (Section 10.4)
   -  Program fetches
   -  Data loads and stores
   -  DMA accesses
10.1 Pipeline Structure

The five major units of the TMS320C30 pipeline structure and their functions
are as follows:


Fetch Unit (F)         Fetches the instruction words from memory and
                       updates the program counter (PC).

Decode Unit (D)        Decodes the instruction word and performs address
                       generation. Any modification of the auxiliary
                       registers and the stack pointer is controlled by this
                       unit.

Read Unit (R)          If required, reads the operands from memory.

Execute Unit (E)       If required, reads the operands from the register
                       file, performs the necessary operation, and if
                       needed writes results to the register file. If
                       required, results of previous operations are written
                       to memory.

DMA Channel (DMA)      Reads and writes memory.


The basic instruction has four levels: fetch, decode, read, and execute. Figure 
10-1 illustrates these four levels of the pipeline structure. The levels are in- 
dexed according to instruction and execution cycle. Also indicated is a place 
in the pipeline where all four units operate in parallel; the perfect overlap oc- 
curs at cycle (m). Those levels about to be executed are at m+1, and those 
just executed are at m-1. The TMS320C30 pipeline control allows for an ex- 
tremely high-speed execution rate by allowing an effective rate of one exe- 
cution per cycle. It also manages pipeline conflicts in a way that makes them 
transparent to the user. The user does not need to take any special precautions 
to guarantee correct operation. 


                                    CYCLE
INSTRUCTION   ... | m-3 | m-2 | m-1 |  m  | m+1 | m+2 | ...
     I            |  F  |  D  |  R  |  E  |
     J                  |  F  |  D  |  R  |  E  |
     K                        |  F  |  D  |  R  |  E  |
     L                              |  F  |  D  |  R  |  E
                                    PERFECT
                                    OVERLAP


Figure 10-1. TMS320C30 Pipeline Structure 
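The overlap illustrated in Figure 10-1 can be generated mechanically for an ideal, conflict-free pipeline. The Python sketch below is illustrative only: the names STAGES, pipeline_rows, and active are hypothetical, cycles are numbered from 0, and the instruction labels I through L are placeholders. It issues one instruction per cycle and reports which stage each instruction occupies in a given cycle.

```python
STAGES = "FDRE"  # fetch, decode, read, execute

def pipeline_rows(names):
    """Return {name: {cycle: stage}} with one instruction issued per cycle."""
    rows = {}
    for i, name in enumerate(names):
        rows[name] = {i + s: STAGES[s] for s in range(len(STAGES))}
    return rows

def active(rows, cycle):
    """Stages in flight during one cycle, keyed by instruction name."""
    return {n: r[cycle] for n, r in rows.items() if cycle in r}

rows = pipeline_rows(["I", "J", "K", "L"])
print(active(rows, 3))  # {'I': 'E', 'J': 'R', 'K': 'D', 'L': 'F'}
```

Cycle 3 in this numbering corresponds to the perfect overlap: all four units are busy, each with a different instruction.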




Priorities have been assigned to each of the functional units. The priorities 
from highest to lowest are: 


•  Execute (highest)
•  Read
•  Decode
•  Fetch
•  DMA (lowest)


When processing of an instruction is ready to pass to the next higher pipeline 
level, but that level is not ready to accept a new input, a pipeline conflict oc- 
curs. In this case, the lower priority unit waits until the higher priority unit 
completes its currently executing function. 


Despite the DMA controller's low priority, conflicts with the CPU can be
minimized or even eliminated by suitable data structuring, since the DMA
controller has its own data and address buses.
10.2 Pipeline Conflicts


The pipeline conflicts of the TMS320C30 can be grouped into the following
main categories:


Branch Conflicts       Involve most of those instructions or operations which
                       read and/or modify the PC.

Register Conflicts     Involve delays that can occur when reading or writing
                       registers used for address generation.

Memory Conflicts       Occur when the internal units of the TMS320C30
                       compete for memory resources.


Each of these three types is discussed in the following sections. Examples are
included. Note that in these examples, when data is refetched or an operation is
repeated, the symbol representing the stage of the pipeline is appended with
a number. For example, if a fetch is performed again, the initial fetch is labeled
F1 and the refetch is labeled F2. When an access is detained multiple cycles
due to a 'not ready,' the symbols RDY-bar (RDY with an overbar) and RDY are
used to indicate not ready and ready, respectively.


10.2.1 Branch Conflicts 
The first class of pipeline conflicts is that which occurs with standard
(nondelayed) branches, i.e., BR, Bcond, DBcond, CALL, IDLE, RPTB, RPTS,
RETIcond, RETScond, interrupts, and reset. Conflicts arise with these
instructions and operations since during their execution, the pipeline is used
only for the completion of the operation; other information fetched into the
pipeline is discarded or refetched, or the pipeline is inactive. This is referred
to as flushing the pipeline. Flushing the pipeline is necessary in these cases
to guarantee that portions of succeeding instructions do not inadvertently get
partially executed. TRAPcond and CALLcond are classified somewhat
differently from the other types of branches and are considered later.


Example 10-1 shows the code and pipeline operation for a standard branch. 
Note that one dummy fetch is performed (F1), and then after the branch ad- 
dress is available, a new fetch (F2) is performed. This dummy fetch will affect 
the cache. 
Example 10-1. Standard Branch


        BR    THREE       ; Unconditional branch
        MPYF              ; Not executed
        ADDF              ; Not executed
        SUBF              ; Not executed
        AND               ; Not executed
THREE   OR                ; Fetched after BR is fetched
        STI


PIPELINE OPERATION

                           THREE->PC
BR THREE  |  F  |  D  |  R  |  E  |
OR              | F1  |(nop)|(nop)| F2  |  D  | ...
STI                                     |  F  | ...


RPTS and RPTB both flush the pipeline, thus allowing for the RS, RE, and 
RC registers to be loaded at the proper time relative to the flow of the pipeline. 
If these registers are loaded without the use of RPTS or RPTB, no flushing of 
the pipeline occurs. If none of the repeat modes are being used, RS, RE, and 
RC may be used as general-purpose 32-bit registers without any pipeline 
conflicts occurring. In cases such as the nesting of RPTB due to nested in- 
terrupts, it may be necessary to load and store these registers directly while 
using the repeat modes. Since up to four instructions can be fetched before 
entering the repeat mode, loads should be followed by a branch to flush the 
pipeline. If the RC is changing when an instruction is loading it, the direct load 
takes priority over the modification made by the repeat mode logic. 


Delayed branches are implemented to guarantee the fetching of the next three 
instructions. The delayed branches include BRD, BcondD, and DBcondD. 
Example 10-2 shows the code and pipeline operation for a delayed branch. 
Example 10-2. Delayed Branch


        BRD   THREE       ; Unconditional delayed branch
        MPYF              ; Executed
        ADDF              ; Executed
        SUBF              ; Executed
        AND               ; Not executed
THREE   MPYF              ; Fetched after SUBF is fetched


PIPELINE OPERATION

                THREE->PC
BRD THREE |  F  |  D  |  R  |  E  |
MPYF            |  F  |  D  |  R  |  E  |
ADDF                  |  F  |  D  |  R  |  E  |
SUBF                        |  F  |  D  |  R  | ...
MPYF                              |  F  |  D  | ...


10.2.2 Register Conflicts 
Register conflicts involve the reading or writing of registers used for
addressing purposes. These conflicts occur when the pertinent register is not
ready to be used. The registers comprise the following three functional groups:


Group 1    Auxiliary registers (AR0-AR7), index registers (IR0, IR1), and
           block size register (BK)

Group 2    Data page pointer (DP)

Group 3    System stack pointer (SP)


If an instruction writes to one of these three groups, the use of any register 
within that particular group by the decode unit is delayed until the write is 
complete, i.e. instruction execution is completed. In Example 10-3, an auxil- 
iary register is loaded, and a different auxiliary register is used on the next in- 
struction. Since the decode stage needs the result of the write to the auxiliary 
register, the decode of this second instruction is delayed two cycles. Every 
time the decode is delayed, a refetch of the program word is performed; i.e., 
the first fetch of ADDF is at F1, followed by F2 and F3 (the final fetch). Since 
these are actual refetches, they can cause conflicts with the DMA controller 
and cache hits and misses. 
Example 10-3. Write to an AR Followed by an AR for Address Generation


        LDI   7,AR1       ; 7 -> AR1
NEXT    MPYF  *AR2,R0     ; Decode delayed 2 cycles
        ADDF
        FLOAT


PIPELINE OPERATION

                                   7->AR1
LDI 7,AR1     |  F  |  D  |  R  |  E  |
MPYF *AR2,R0        |  F  | D1  | D2  | D3  |  R  | ...
ADDF                      | F1  | F2  | F3  |  D  | ...
FLOAT                                       |  F  | ...


The case for reads of these groups is similar to the case for writes. If an
instruction must read a member of one of these groups, the use of that particular
group by the decode for the following instruction is delayed until the read is
complete. The registers are read at the start of the execute cycle and therefore
require only a one-cycle delay of the following decode. For four registers (IR0,
IR1, BK, or DP) no delay is incurred. In all other cases, including the SP, the
delay occurs. In Example 10-4, two auxiliary registers are added together with
the result going to an extended-precision register. The next instruction uses
a different auxiliary register as an address register.


Example 10-4. A Read of ARs Followed by ARs for Address Generation 


        ADDI  AR0,AR1,R1  ; AR0 + AR1 -> R1
NEXT    MPYF  *++AR2,R0   ; Decode delayed 1 cycle
        ADDF
        FLOAT


PIPELINE OPERATION

                                   AR0+AR1->R1
ADDI AR0,AR1,R1 |  F  |  D  |  R  |  E  |
MPYF *++AR2,R0        |  F  | D1  | D2  |  R  |  E  |
ADDF                        | F1  | F2  |  D  |  R  | ...
FLOAT                                   |  F  |  D  | ...


Note that while the DBR (decrement and branch) instruction does not use the 
auxiliary registers for addressing, its use of them as loop counters is treated 
as if it did. Therefore, the operation shown in the two previous examples can 
also occur for this instruction. 
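The register conflict rules of this section reduce to a small decision function. The Python sketch below is illustrative only: the names GROUPS, NO_READ_DELAY, group_of, and decode_delay are hypothetical. It takes the two-cycle figure for writes from Example 10-3 and the one-cycle figure for reads from Example 10-4, and assumes those figures apply to every register of the group concerned, apart from the four registers the text exempts from the read delay.

```python
GROUPS = {
    1: {"AR0", "AR1", "AR2", "AR3", "AR4", "AR5", "AR6", "AR7",
        "IR0", "IR1", "BK"},
    2: {"DP"},
    3: {"SP"},
}
NO_READ_DELAY = {"IR0", "IR1", "BK", "DP"}  # reads of these cost nothing

def group_of(reg):
    """Return the group number of an addressing register, else None."""
    for number, members in GROUPS.items():
        if reg in members:
            return number
    return None

def decode_delay(access, reg, next_addr_reg):
    """Cycles by which the following instruction's decode is delayed.

    access        -- "write" or "read" performed by the first instruction
    reg           -- register that the first instruction writes or reads
    next_addr_reg -- register the next instruction uses for addressing
    """
    g = group_of(reg)
    if g is None or g != group_of(next_addr_reg):
        return 0
    if access == "write":
        return 2
    return 0 if reg in NO_READ_DELAY else 1

print(decode_delay("write", "AR1", "AR2"))  # 2, as in Example 10-3
print(decode_delay("read", "AR0", "AR2"))   # 1, as in Example 10-4
```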
10.2.3 Memory Conflicts

Possible memory conflicts occur when the memory bandwidth of a physical
memory space is exceeded. For example, RAM blocks 0 and 1 and the ROM
block can support only two accesses every cycle. The external interface can
support only one access per cycle. Some conditions under which memory
conflicts can be easily avoided are discussed in Section 10.3.
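The bandwidth limits stated above can be checked with a simple count of accesses per memory space. The Python sketch below is illustrative only: CAPACITY and fetch_must_wait are hypothetical names, and the space labels PRIMARY and EXPANSION are assumed names for the two external ports. Since CPU data accesses take priority over program fetches, the fetch waits whenever data accesses already use every slot of the space it needs.

```python
# Accesses each physical memory space can service in one cycle.
CAPACITY = {"RAM0": 2, "RAM1": 2, "ROM": 2, "PRIMARY": 1, "EXPANSION": 1}

def fetch_must_wait(data_accesses, fetch_space):
    """True if CPU data accesses leave no slot for a program fetch.

    data_accesses -- list of space names accessed by CPU data this cycle
    fetch_space   -- space that the program fetch targets
    """
    used = data_accesses.count(fetch_space)
    return used >= CAPACITY[fetch_space]

# Two data reads from RAM block 0 block a fetch from the same block.
print(fetch_must_wait(["RAM0", "RAM0"], "RAM0"))  # True
print(fetch_must_wait(["RAM0", "RAM1"], "RAM0"))  # False
```

The second call suggests the usual remedy: placing the two data operands in different memory blocks leaves a slot free for the program fetch.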


Memory pipeline conflicts consist of the following four types: 


Program Wait                 A program fetch is prevented from beginning.

Program Fetch Incomplete     A program fetch has begun, but is not yet
                             complete.

Execute Only                 An instruction sequence requires three
                             CPU-data accesses in a single cycle.

Hold Everything              A primary or expansion bus operation must
                             complete before another one can proceed.


These four types of memory conflicts are discussed in the succeeding para- 
graphs and examples provided. 


Program Wait 


Two conditions can prevent the program fetch from beginning: 


•  The start of a CPU-data access when:
   -  Two CPU-data accesses are made to an internal RAM or ROM
      block, and a program fetch from the same block is necessary.
   -  One of the external ports is starting a CPU-data access, and a
      program fetch from the same port is necessary.


•  A multicycle CPU-data access or DMA-data access over the external bus
   is needed.
An example of program wait until a CPU-data access completes is illustrated
in Example 10-5. In this case, *AR0 and *AR1 are both pointing to data in
RAM block 0, and the MPYF instruction will be fetched from RAM block 0.
This results in the conflict shown in Example 10-5. Since no more than two
accesses can be made to RAM block 0 in a single cycle, the program fetch
cannot begin, and must wait until the CPU-data accesses are complete.


Example 10-5. Program Wait Until CPU-Data Access Completes 
        ADDF3 *AR0,*AR1,R0
        FIX
        MPYF
        ADDF3
        NEGB


PIPELINE OPERATION

                                 *AR0 MEMORY READ
                                 *AR1 MEMORY READ
ADDF3 *AR0,*AR1,R0 |  F  |  D  |  R  |  E  |
FIX                      |  F  |  D  |  R  |  E  |
                               (wait)
MPYF                           |  F  |  F  |  D  |  R  |  E  |
ADDF3                                      |  F  |  D  |  R  |
NEGB                                             |  F  |  D  |


Example 10-6 shows an example of a program wait due to a multicycle
data-data access or a multicycle DMA access. The ADDF, MPYF, and SUBF are
fetched from some portion in memory other than the external port the DMA
requires. The DMA begins a multicycle access. The program fetch
corresponding to the CALL is made to the same external port the DMA is using.
Even though the DMA has the lowest priority, a multicycle access cannot be
aborted. The program fetch must therefore wait until the DMA access
completes.


Example 10-6. Program Wait Due to Multicycle Access 


ADDF  |  F  |  D  |  R  |  E  |
MPYF        |  F  |  D  |  R  |  E  |
SUBF              |  F  |  D  |  R  |  E  |
                        (wait)
CALL                    |  F  |  F  |  D  |

                        | 2-cycle DMA access |
Program Fetch Incomplete


A program fetch incomplete occurs when a program fetch takes more than one 
cycle to complete due to wait states. In Example 10-7, the MPYF and ADDF 
are fetched from memory that supports single-cycle accesses. The SUBF is 
fetched from memory requiring one wait state. 


Example 10-7. Multicycle Program Memory Fetches 
MPYF  |  F  |  D  |  R  |  E  |
ADDF        |  F  |  D  |  R  |  E  |
                 RDY-bar  RDY
SUBF              |  F  |  F  |  D  |  R  |  E  |
ADDI                          |  F  |  D  |  R  |


Execute Only 


The Execute Only type of memory pipeline conflict occurs when a sequence
of instructions requires three CPU-data accesses in a single cycle or when
an interlocked load is performed. The three cases where this occurs are:


•  An instruction that performs a store, followed by an instruction that
   does two memory reads.

•  An instruction that performs two stores, followed by an instruction
   that performs at least one memory read.

•  An interlocked load (LDII or LDFI) instruction is performed, and
   XF1 = 1.
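The three cases above can be restated as a small predicate. The following is a minimal Python sketch, not part of the device documentation: the function name and its parameters are hypothetical, and it models only the counting rule given in the list (at most two CPU-data accesses per cycle), not the pipeline itself.

```python
def execute_only_conflict(stores_in_first, reads_in_second,
                          interlocked_load=False, xf1=0):
    """Return True if two adjacent instructions cause an Execute Only conflict.

    stores_in_first  -- memory stores performed by the earlier instruction (0-2)
    reads_in_second  -- memory reads performed by the following instruction (0-2)
    interlocked_load -- True if the instruction is LDII or LDFI (hypothetical flag)
    xf1              -- state of XF1 when the interlocked load is performed
    """
    # Case 3: an interlocked load with XF1 = 1 extends its read cycle.
    if interlocked_load and xf1 == 1:
        return True
    # Cases 1 and 2: the stores and reads together need three or more
    # CPU-data accesses, but only two are available in a cycle.
    return (stores_in_first >= 1 and reads_in_second >= 1
            and stores_in_first + reads_in_second >= 3)
```

For instance, a single STF followed by LDF || LDF gives `execute_only_conflict(1, 2)`, which is true, while a single store followed by a single read is not a conflict under this rule.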
The first case is shown in Example 10-8. Since this sequence requires three
data memory accesses and only two are available, only the execute phase of
the pipeline is allowed to proceed. The dual reads required by the LDF || LDF
will be delayed one cycle. Note that a refetch of the next instruction can
occur.


Example 10-8. Single Store Followed by Two Reads

        STF     R0,*AR1         ; R0 → *AR1
        LDF     *AR2,R1         ; *AR2 → R1 in parallel with
     || LDF     *AR3,R2         ; *AR3 → R2

PIPELINE OPERATION

                            R0→*AR1
STF R0,*AR1     | F | D | R | E |
                                 *AR2→R1
                                 *AR3→R2
LDF || LDF          | F | D | R1 | R2 | E |
                        | F | D1 | D2 | R ...
                            | F1 | F2 | D ...


Example 10-9 shows a parallel store followed by a single load or read. Since 
the two parallel stores are required, the next CPU-data memory read must wait 
a cycle before beginning. One program memory refetch may occur. 


Example 10-9. Parallel Store Followed by Single Read

        STF     R0,*AR0         ; R0 → *AR0 in parallel with
     || STF     R2,*AR1         ; R2 → *AR1
        ADDF    @SUM,R1         ; R1 + @SUM → R1
        IACK
        ASH

PIPELINE OPERATION

                            R0→*AR0
                            R2→*AR1
STF || STF      | F | D | R | E |
                            (wait) @SUM
ADDF @SUM,R1        | F | D | R | R | E |
                        | F | D1 | D2 | R |
                            | F1 | F2 | D |
The final case involves an interlocked load (LDII or LDFI) instruction and
XF1 = 1. Since the interlocked loads use the XF1 pin as an acknowledge that
the read can complete, they may need to extend the read cycle, as shown in
Example 10-10. Note that a program refetch may occur.


Example 10-10. Interlocked Load

NOT     | F | D | R | E |
                    XF1=1 XF1=0
LDII        | F | D | R | R | E |
ADDI            | F | D1 | D2 | R ...
CMPI                | F1 | F2 | D ...


Hold Everything 


The three types of Hold Everything memory pipeline conflicts are:


•  A CPU-data load or store cannot be performed because an external port
   is busy.

•  An external load that takes more than one cycle.

•  Conditional calls and traps.


The first type of Hold Everything conflict occurs when one of the external 
ports is busy due to an access that has started but is not complete. In Example 
10-11, the first store is a two-cycle store. The CPU writes the data to an ex- 
ternal port. The port control then takes two cycles to complete the data-data 
write. The LDF is a read over the same external port. Since the store is not 
complete, the LDF will continue to be attempted until the port is available. 
For this case, the first dummy fetch occurs at the same time as D2. 


Example 10-11. Busy External Port

        STF     R0,@DMA1
        LDF     @DMA2,R0

PIPELINE OPERATION

                    | 2-cycle DMA access |
STF     | F | D | R | E |
LDF         | F | D | nop | R | E |
                | F | D1  | D2 | R ...
                    | F1  | F2 | D |
                               | F ...
The second type of Hold Everything conflict involves multicycle data reads.
The read has begun and continues until completed. In Example 10-12, the
LDF is performed from an external memory that requires several cycles to
complete.


Example 10-12. Multicycle Data Reads

                | F | D | R | E |
                            | 2-cycle read |
LDF @DMA,R0         | F | D | R | R | E |
                        | F | D1 | D2 | R ...
                            | F | nop | D ...
                                | F1  | F2 ...


The final type of Hold Everything conflict deals with conditional calls and 
traps, which are different from the other branch instructions. Whereas the 
other branch instructions are conditional loads, the conditional calls and traps 
are conditional stores, which take one cycle more than a standard branch (see 
Example 10-13). 


Example 10-13. Conditional Calls and Traps

                        (store)
CALLcond    | F | D | R | E |
                | F1 | (nop) | (nop) | F2 | F3 | D |
10.3 Resolving Memory Conflicts




If program fetches and data accesses are performed in such a manner that the
resources being used cannot provide the necessary bandwidth, the program
fetch is delayed until the data access is complete. Certain configurations of
program fetch and data accesses yield conditions under which the
TMS320C30 can achieve maximum throughput. Table 10-1 shows how many
accesses can be performed from the different memory spaces when a program
fetch and a single data access are needed and maximum performance (one
cycle) is still achieved. Four cases achieve this single-cycle maximum.
Table 10-2 shows how many accesses can be performed from the different
memory spaces when a program fetch and two data accesses are needed and
maximum performance (one cycle) is still achieved. Six cases achieve this
maximum.


Table 10-1. One Program Fetch and One Data Access for Maximum Performance

PRIMARY BUS      ACCESSES FROM               EXPANSION BUS
ACCESSES         DUAL-ACCESS                 OR PERIPHERAL

                 2 from any combination
                 of internal memory
Table 10-2. One Program Fetch and Two Data Accesses for Maximum Performance

CASE #   PRIMARY BUS    ACCESSES FROM                 EXPANSION OR
         ACCESSES       DUAL-ACCESS                   PERIPHERAL BUS

                        from any combination
                        of internal memory

                        2 from same internal
                        memory block and 1 from
                        a different internal memory

                        3 from different internal
                        memory blocks

                        2 from any combination        1
                        of internal memory
10.4 Clocking of Memory Accesses


Internal clock phases (H1 and H3) and their relationship to memory accesses 
are discussed in this section to show how the TMS320C30 handles multiple 
memory accesses. Whereas the previous section discussed the interaction 
between sequences of instructions, this section discusses the flow of data on 
an individual instruction basis. 


Each major clock period of 60 ns is composed of two minor clock periods of
30 ns, labeled H3 and H1.


        |<------- Major Clock Period ------->|
        |        H3        |        H1       |


The precise operation of memory reads and writes can be defined, based upon 
these minor clock periods. The types of memory operations which can occur 
are program fetches, data loads and stores, and DMA accesses. 


10.4.1 Program Fetches 


Internal program fetches are always performed during H3 unless a single data 
store must occur at the same time, due to another instruction in the pipeline. 
In this case, the program fetch occurs during H1 and the data store during 
H3. 


External program fetches always start at the beginning of H3 with the address 
being presented on the external bus. At the end of H1, they are completed 
with the latching of the instruction word. 


10.4.2 Data Loads and Stores 
Four types of instructions perform loads, memory reads, and stores:
two-operand instructions, three-operand instructions, multiplier/ALU
operation with store instructions, and parallel multiply and add
instructions. See Section 6 for detailed information on addressing modes.


Two-Operand Instruction Memory Accesses 


Two-operand instructions include all those instructions with bits 31-29 being
000 or 010 (see Figure 10-2). In the case of a data read, bits 15-0 represent
the src operand. Internal data reads are always performed during H1. External
data reads always start at the beginning of H3, with the address being
presented on the external bus, and complete with the latching of the data
word at the end of H1.
In the case of a data store, bits 15-0 represent the dst operand. Internal
data stores are performed during H3. External data stores always start at the
beginning of H3 with the address and data being presented on the external
bus.


31      24 23      16 15       8 7        0


Figure 10-2. Two-Operand Instruction Word


Three-Operand Instruction Memory Reads 


Three-operand instructions include all instructions with bits 31-29 being 001
(see Figure 10-3). The source operands, src1 and src2, come from either
registers or memory. When one or more of the source operands are from
memory, these instructions are always memory reads.


If only one of the source operands is from memory (either src1 or src2) and
is located in internal memory, the data is read during H1. If the single memory
source operand is in external memory, the read starts at the beginning of H3,
with the address being presented on the external bus, and completes with the
latching of the data word at the end of H1.


If both source operands are to be fetched from memory, then several cases
occur. If both operands are located in internal memory, the src1 read is
performed during H3 and the src2 read during H1, thus completing two memory
reads in a single cycle.


If src1 is in internal memory and src2 in external memory, the src2 access is
begun at the start of H3 and latched at the end of H1. At the same time, the
src1 access to internal memory is performed during H3. Again, two memory
reads are completed in a single cycle.


If src1 is in external memory and src2 in internal memory, two cycles are
necessary to complete the two reads. In the first cycle, the internal src2
access is performed. The src1 access is also performed, but not latched until
the next H3.


If src1 and src2 are both from external memory, two cycles are required to
complete the two reads. In the first cycle, the src1 access is performed and
loaded on the next H3; in the second cycle, the src2 access is performed and
loaded on that cycle's H1.
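The four two-operand-from-memory cases above reduce to a simple rule: the number of cycles depends only on where src1 resides. A minimal Python sketch follows; the function name is hypothetical, and the model assumes external memory with no wait states, which the text above does not state explicitly.

```python
def dual_read_cycles(src1_internal: bool, src2_internal: bool) -> int:
    """Cycles needed to read both memory source operands.

    Follows the four cases in the text:
      src1 internal, src2 internal -> 1 (src1 in H3, src2 in H1)
      src1 internal, src2 external -> 1 (src2 spans H3-H1, src1 in H3)
      src1 external, src2 internal -> 2
      src1 external, src2 external -> 2
    """
    # src2_internal is kept as a parameter for clarity, although the cycle
    # count in the four cases is determined by src1 alone.
    return 1 if src1_internal else 2
```

Seen this way, when only one of two memory operands can be placed in internal memory, making it src1 keeps the instruction at a single cycle under these assumptions.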


31      24 23      16 15       8 7        0


Figure 10-3. Three-Operand Instruction Word
Operations with Parallel Stores


The next class of instructions includes all instructions that have a store in
parallel with another instruction. Bits 31 and 30 for these instructions are
equal to 11.


For those operations that perform a multiply or ALU operation in parallel with
a store, the instruction word format is shown in Figure 10-4. Whether the
store operation to dst2 is external or internal, it is performed during H3.


If the memory read operation is external, it is started at the beginning of H3
and latched at the end of H1. If the memory read operation is internal, it is
performed during H1. Note that memory reads are performed by the CPU
during the read (R) phase of the pipeline, and stores during the execute (E)
phase.


31      24 23      16 15       8 7        0


Figure 10-4. A Multiply or CPU Operation with a Parallel Store


The instruction word format for those instructions that have parallel stores to
memory is shown in Figure 10-5. If both destination operands, dst1 and
dst2, are located in internal memory, dst1 is stored during H3 and dst2 during
H1, thus completing two memory stores in a single cycle.


If dst1 is in external memory and dst2 in internal memory, the dst1 store is
begun at the start of H3. The dst2 store to internal memory is performed
during H1. Again, two memory stores are completed in a single cycle.


If dst1 is in internal memory and dst2 in external memory, two cycles are
necessary to complete the dst2 store. In the first cycle, the internal dst1
store is performed during H3. During the next cycle, the dst2 store is
performed beginning in H3.


If dst1 and dst2 are both in external memory, two cycles are necessary to
complete the dst2 store. In the first cycle, the dst1 access is performed; in
the second cycle, the dst2 access is performed.
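The parallel-store cases can be tabulated in the same spirit. The sketch below is illustrative only: the table and function names are hypothetical, and zero-wait-state external memory is assumed. It encodes the four combinations described above as a lookup keyed on where dst1 and dst2 reside.

```python
# (dst1 location, dst2 location) -> cycles to complete both stores,
# transcribed from the four cases described in the text.
PARALLEL_STORE_CYCLES = {
    ("internal", "internal"): 1,  # dst1 in H3, dst2 in H1
    ("external", "internal"): 1,  # dst1 starts in H3, dst2 in H1
    ("internal", "external"): 2,  # dst2 store begins in H3 of the next cycle
    ("external", "external"): 2,  # dst1 in first cycle, dst2 in second
}


def parallel_store_cycles(dst1: str, dst2: str) -> int:
    """Look up the cycle count for a pair of parallel stores."""
    return PARALLEL_STORE_CYCLES[(dst1, dst2)]
```

Note the asymmetry with the dual-read case: for stores it is the location of dst2, not dst1, that decides whether a second cycle is needed.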


31      24 23      16 15       8 7        0


Figure 10-5. Two Parallel Stores
Parallel Multiplies and Adds


The considerations of memory addressing for parallel multiplies and adds are
similar to those for three-operand instructions. The parallel multiplies and
adds include all instructions with bits 31-30 equal to 10 (see Figure 10-6).


For these operations, src3 and src4 are both located in memory. If both
operands are located in internal memory, the src3 read is performed during
H3 and the src4 read during H1, thus completing two memory reads in a single
cycle.


If src3 is in internal memory and src4 in external memory, the src4 access is
begun at the start of H3 and latched at the end of H1. At the same time, the
src3 access to internal memory is performed during H3. Again, two memory
reads are completed in a single cycle.


If src3 is in external memory and src4 in internal memory, two cycles are
necessary to complete the two reads. In the first cycle, the internal src4
access is performed. During the H3 of the next cycle, the src3 access is
performed.


If src3 and src4 are both from external memory, two cycles are necessary to
complete the two reads. In the first cycle, the src3 access is performed; in
the second cycle, the src4 access is performed.


31      24 23      16 15       8 7        0


Figure 10-6. Parallel Multiplies and Adds
Section 11


Assembly Language Instructions 


The TMS320C30 assembly language instruction set supports numeric-intensive
signal processing and general-purpose applications. The instructions are
organized into major groups consisting of load and store, two- or
three-operand arithmetic/logical, parallel, program control, and interlocked
operations instructions. The addressing modes used with the instructions are
described in Section 6.


An additional feature of the TMS320C30 instruction set is the capability of 
using one of 19 condition codes with any of the 10 conditional instructions, 
such as LDFcond. This section defines the condition codes and flags. 


The assembler allows optional syntax forms to simplify the assembly language 
for special-case instructions. These optional forms are listed and explained. 


Each of the individual instructions is described and listed in alphabetical order. 
An illustration showing an example instruction (see pages 11-15 through 
11-17) is provided to show the special format used and explain its content. 


Major topics discussed in this section are as follows: 


•  Instruction Set (Section 11.1)
   -  Load and store instructions
   -  Two-operand arithmetic/logical instructions
   -  Three-operand arithmetic/logical instructions
   -  Program control instructions
   -  Interlocked operations instructions
   -  Parallel operations instructions


•  Condition Codes and Flags (Section 11.2)


•  Individual Instructions (Section 11.3)
   -  Symbols and abbreviations used in instructions
   -  Optional assembler syntaxes
   -  Individual instruction descriptions, alphabetized and including:

         Syntax
         Operation
         Operands
         Encoding
         Description
         Cycles
         Status bits
         Mode bit
         Example(s)
11.1 Instruction Set


The TMS320C30 instruction set is exceptionally well suited to digital signal 
processing and other numeric-intensive applications. All instructions are a 
single machine word long, and most instructions take a single cycle to exe- 
cute. In addition to multiply and accumulate instructions, the TMS320C30 
possesses a full complement of general-purpose instructions. 


The instruction set contains 114 instructions organized into the following 
functional groups: 


•  Load and store
•  Two-operand arithmetic/logical
•  Three-operand arithmetic/logical
•  Program control
•  Interlocked operations
•  Parallel operations


Each of these groups is discussed in the succeeding subsections. 


11.1.1 Load and Store Instructions 


The TMS320C30 supports 12 load and store instructions (see Table 11-1). 
These instructions can: 


•  Load a word from memory into a register,
•  Store a word from a register into memory, or
•  Manipulate data on the system stack.


Two of these instructions can load data conditionally. This is useful for locat- 
ing the maximum or minimum value in a data set. See Section 11.2 for detailed 
information on condition codes. 


Table 11-1. Load and Store Instructions


INSTRUCTION  DESCRIPTION                     INSTRUCTION  DESCRIPTION
LDE          Load floating-point exponent    POP          Pop integer from stack
LDF          Load floating-point value       POPF         Pop floating-point value from
                                                          stack
LDFcond      Load floating-point value       PUSH         Push integer on stack
             conditionally
LDI          Load integer                    PUSHF        Push floating-point value on
                                                          stack
LDIcond      Load integer conditionally      STF          Store floating-point value
LDM          Load floating-point mantissa    STI          Store integer
11.1.2 Two-Operand Instructions


The TMS320C30 supports a complete set of two-operand arithmetic and 
logical instructions. The two operands are the source and destination. The 
source operand may be a memory word, a register, or a part of the instruction 
word. The destination operand is always a register. 


These instructions provide integer, floating-point, or logical operations, and 
multiprecision arithmetic. Table 11-2 lists these instructions. 


Table 11-2. Two-Operand Instructions


INSTRUCTION  DESCRIPTION                        INSTRUCTION  DESCRIPTION
ABSF         Absolute value of a floating-      NORM         Normalize floating-point value
             point number
ANDN †       Bitwise logical-AND with           ROR          Rotate right
             complement
ASH †        Arithmetic shift                   RORC         Rotate right through carry
CMPF †       Compare floating-point values      SUBB †       Subtract integers with borrow
CMPI †       Compare integers                   SUBC         Subtract integers conditionally
FIX          Convert floating-point value to    SUBF         Subtract floating-point values
             integer
FLOAT        Convert integer to floating-point  SUBI         Subtract integer
             value
LSH †        Logical shift                      SUBRB        Subtract reverse integer with
                                                             borrow
MPYF †       Multiply floating-point values     SUBRF        Subtract reverse floating-point
                                                             value


† Two- and three-operand versions




11.1.3 Three-Operand Instructions 


Most instructions have only two operands; however, some arithmetic and 
logical instructions have three-operand versions. Three-operand instructions 
allow the TMS320C30 to read two operands from memory or the CPU register 
file in a single cycle and store the results in a register. The following differ- 
entiates the two- and three-operand instructions: 


@ Two-operand instructions have a single source operand (or shift count) 
and a destination operand. 


@ Three-operand instructions may have two source operands (or one 
source operand and a count operand) and a destination operand. A 
source operand may be a memory word or a register. The destination 
of a three-operand instruction is always a register. 


Table 11-3 lists the instructions that have three-operand versions. Note that 
the ‘3’ in the mnemonic can be omitted from three-operand instructions (see 
Section 11.3.2). 


Table 11-3. Three-Operand Instructions


AND3      Bitwise logical-AND              SUBB3     Subtract integers with borrow
ANDN3     Bitwise logical-AND with         SUBF3     Subtract floating-point values
          complement
CMPF3
CMPI3


11.1.4 Program Control Instructions 


The program-control instruction group consists of all instructions that
affect program flow. The repeat mode allows repetition of a block of
code (RPTB) or of a single line of code (RPTS). Both standard and delayed
(single-cycle) branching are supported. Several of the program control
instructions are capable of conditional operations (see Section 11.2 for
detailed information on condition codes). Table 11-4 lists the program
control instructions.


Table 11-4. Program Control Instructions


INSTRUCTION  DESCRIPTION                      INSTRUCTION  DESCRIPTION
Bcond        Branch conditionally (standard)  IDLE         Idle until interrupt
BcondD       Branch conditionally (delayed)   NOP          No operation
BR           Branch unconditionally           RETIcond     Return from interrupt
             (standard)                                    conditionally
BRD          Branch unconditionally           RETScond     Return from subroutine
             (delayed)                                     conditionally
CALL         Call subroutine                  RPTB         Repeat block of instructions
CALLcond     Call subroutine conditionally    RPTS         Repeat single instruction
DBcond       Decrement and branch             SWI          Software interrupt
             conditionally (standard)
DBcondD      Decrement and branch             TRAPcond     Trap conditionally
             conditionally (delayed)


11.1.5 Interlocked Operations Instructions

The interlocked operations instructions support multiprocessor communi- 
cation. Through the use of external signals, these instructions allow for pow- 
erful synchronization mechanisms. They also guarantee the integrity of the 
communication and result in a high-speed operation. Refer to Section 7 for 
examples of the use of interlocked instructions. Table 11-5 lists the five in- 
terlocked operations instructions. 


Table 11-5. Interlocked Operations Instructions 


INSTRUCTION  DESCRIPTION                    INSTRUCTION  DESCRIPTION
LDFI         Load floating-point value,     STFI         Store floating-point value,
             interlocked                                 interlocked
LDII         Load integer, interlocked      STII         Store integer, interlocked
SIGI         Signal, interlocked


11.1.6 Parallel Operations Instructions 


The parallel-operations instructions group allows for a high degree of paral- 
lelism. Some of the TMS320C30 instructions can occur in pairs that will be 
executed in parallel. These parallel instructions provide: 


•  Parallel loading of registers,
•  Parallel arithmetic operations, or
•  Arithmetic/logical instructions used in parallel with a store instruction.


Each instruction in a pair is entered as a separate source statement. The
second instruction in the pair must be preceded by two vertical bars (||).
Table 11-6 lists the valid instruction pairs.
Table 11-6. Parallel Instructions


MNEMONIC   DESCRIPTION                          OPERATION


PARALLEL ARITHMETIC WITH STORE INSTRUCTIONS


ABSF       Absolute value of a floating-point   |src2| → dst1
|| STF                                          || src3 → dst2
ABSI       Absolute value of an integer         |src2| → dst1
|| STI                                          || src3 → dst2
ADDF3      Add floating-point                   src1 + src2 → dst1
|| STF                                          || src3 → dst2
ADDI3      Add integer                          src1 + src2 → dst1
|| STI                                          || src3 → dst2
AND3       Bitwise logical-AND                  src1 AND src2 → dst1
|| STI                                          || src3 → dst2
ASH3       Arithmetic shift                     If count ≥ 0:
|| STI                                             src2 << count → dst1
                                                   || src3 → dst2
                                                Else:
                                                   src2 >> |count| → dst1
                                                   || src3 → dst2
FIX        Convert floating-point to integer    Fix(src2) → dst1
|| STI                                          || src3 → dst2
FLOAT      Convert integer to floating-point    Float(src2) → dst1
|| STF                                          || src3 → dst2
LDF        Load floating-point                  src2 → dst1
|| STF                                          || src3 → dst2
LDI        Load integer                         src2 → dst1
|| STI                                          || src3 → dst2
LSH3       Logical shift                        If count ≥ 0:
|| STI                                             src2 << count → dst1
                                                   || src3 → dst2
                                                Else:
                                                   src2 >> |count| → dst1
                                                   || src3 → dst2
MPYF3      Multiply floating-point              src1 x src2 → dst1
|| STF                                          || src3 → dst2
MPYI3      Multiply integer                     src1 x src2 → dst1
|| STI                                          || src3 → dst2
NEGF       Negate floating-point                0 - src2 → dst1
|| STF                                          || src3 → dst2


LEGEND:
src1 - register addr (R0-R7)    src2 - indirect addr (disp = 0, 1, IR0, IR1)
src3 - register addr (R0-R7)    src4 - indirect addr (disp = 0, 1, IR0, IR1)
dst1 - register addr (R0-R7)    dst2 - indirect addr (disp = 0, 1, IR0, IR1)
Table 11-6. Parallel Instructions (Concluded)


MNEMONIC   DESCRIPTION                          OPERATION


PARALLEL ARITHMETIC WITH STORE INSTRUCTIONS (Concluded)


NEGI       Negate integer                       0 - src2 → dst1
|| STI                                          || src3 → dst2
NOT3       Complement                           ~src2 → dst1
|| STI                                          || src3 → dst2
OR3        Bitwise logical-OR                   src1 OR src2 → dst1
|| STI                                          || src3 → dst2
STF        Store floating-point                 src1 → dst1
|| STF                                          || src3 → dst2
STI        Store integer                        src1 → dst1
|| STI                                          || src3 → dst2
SUBF3      Subtract floating-point              src1 - src2 → dst1
|| STF                                          || src3 → dst2
SUBI3      Subtract integer                     src1 - src2 → dst1
|| STI                                          || src3 → dst2
XOR3       Bitwise exclusive-OR                 src1 XOR src2 → dst1
|| STI                                          || src3 → dst2


PARALLEL LOAD INSTRUCTIONS


LDF        Load floating-point                  src2 → dst1
|| LDF                                          || src4 → dst2
LDI        Load integer                         src2 → dst1
|| LDI                                          || src4 → dst2


PARALLEL MULTIPLY AND ADD/SUBTRACT INSTRUCTIONS


MPYF3      Multiply and add floating-point      op1 x op2 → op3
|| ADDF3                                        || op4 + op5 → op6
MPYF3      Multiply and subtract floating-point op1 x op2 → op3
|| SUBF3                                        || op4 - op5 → op6
MPYI3      Multiply and add integer             op1 x op2 → op3
|| ADDI3                                        || op4 + op5 → op6
MPYI3      Multiply and subtract integer        op1 x op2 → op3
|| SUBI3                                        || op4 - op5 → op6


LEGEND:
src1 - register addr (R0-R7)    src2 - indirect addr (disp = 0, 1, IR0, IR1)
src3 - register addr (R0-R7)    src4 - indirect addr (disp = 0, 1, IR0, IR1)
dst1 - register addr (R0-R7)    dst2 - indirect addr (disp = 0, 1, IR0, IR1)
op3  - register addr (R0 or R1) op6  - register addr (R2 or R3)

op1,op2,op4,op5 - Two of these operands must be specified using register addr,
                  and two must be specified using indirect addr.
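The ASH3 and LSH3 rows of Table 11-6 describe a shift whose direction follows the sign of the count. As an illustration only, the following Python sketch models that behaviour on 32-bit words. The function names are hypothetical; the sketch assumes that an arithmetic right shift replicates the sign bit and a logical right shift fills with zeros, and it does not model the limited count range of the actual instruction encoding or the parallel store.

```python
MASK32 = 0xFFFFFFFF


def lsh(value: int, count: int) -> int:
    """Logical shift of a 32-bit word: left if count >= 0, else right by |count|."""
    value &= MASK32
    if count >= 0:
        return (value << count) & MASK32
    return value >> (-count)          # zero fill from the left


def ash(value: int, count: int) -> int:
    """Arithmetic shift of a 32-bit word: left if count >= 0, else right by |count|."""
    value &= MASK32
    if count >= 0:
        return (value << count) & MASK32
    # Reinterpret as a signed integer so that Python's >> replicates the sign bit.
    signed = value - (1 << 32) if value & 0x80000000 else value
    return (signed >> (-count)) & MASK32
```

The two functions agree for non-negative counts and for positive values; they differ only when a word with its MSB set is shifted right.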
11.2 Condition Codes and Flags


The TMS320C30 provides 20 condition codes that can be used with any of 
the conditional instructions, such as RETScond or LDFcond. The conditions 
include signed and unsigned comparisons, comparisons to zero, and compar- 
isons based on the status of individual condition flags. Note that all condi- 
tional instructions can accept the suffix ‘U’ to indicate unconditional 
operation. 


Seven condition flags provide information related to properties of the result 
of arithmetic and logical instructions. The condition flags are stored in the 
status register (ST). These flags are modified by the majority of instructions 
according to whether a result is generated when performing the specified op- 
eration to infinite precision or an output is written to the destination register. 
The formats for output values are shown in Table 11-7. 


Table 11-7. Output Value Formats 


TYPE OF OPERATION OUTPUT FORMAT 


Floating-point 8-bit exponent, 1 sign bit, 31-bit fraction 
Integer 32-bit integer 
Logical 32-bit unsigned integer 


The condition flags are affected by instructions in only the following cases: 


1) The destination register is one of the extended-precision registers
   (R0-R7).

2) The instruction is one of the compare instructions (CMPF, CMPF3,
   CMPI, CMPI3, TSTB, or TSTB3).


Case 1 allows for modification of the registers used for addressing without 
affecting the condition flags during computation. Case 2 makes it possible to 
set the condition flags based upon the contents of any of the CPU registers. 


The following list defines the condition flags and describes how the flags are 
set by most instructions. For specific details of the effect of a particular in- 
struction on the condition flags, see the description of that instruction in 
Section 9.2. 


N     Negative Condition Flag. Logical operations assign N the state
      of the MSB of the output value. For integer and floating-point
      operations, N is set if the result is negative, and cleared otherwise.
      Zero is considered to be positive.


Z     Zero Condition Flag. For logical, integer, and floating-point
      operations, Z is set if the output is 0, and cleared otherwise.


V     Overflow Condition Flag. For integer operations, V is set if the
      result does not fit into the format specified for the destination
      (i.e., if it lies outside the range -2^31 ≤ result ≤ 2^31 - 1).
      Otherwise, V is cleared. For floating-point operations, V is set if
      the exponent of the result is greater than 127; otherwise, V is
      cleared. Logical operations always clear V.


C     Carry Flag. When an integer addition is performed, C is set if a carry
      occurs out of the bit corresponding to the MSB of the output. When
      an integer subtraction is performed, C is set if a borrow occurs into
      the bit corresponding to the MSB of the output. Otherwise, for integer
      operations, C is cleared. The carry flag is unaffected by
      floating-point and logical operations.


UF    Floating-Point Underflow Condition Flag. A floating-point
      underflow occurs whenever the exponent of the result is less than or
      equal to -128. If a floating-point underflow occurs, UF is set, and the
      output value is set to 0. UF is cleared if a floating-point underflow
      does not occur.


LV    Latched Overflow Condition Flag. LV is set whenever V (overflow
      condition flag) is set. Otherwise, it is unchanged. LV may only
      be cleared by a processor reset or by modifying it in the status
      register (ST).


LUF   Latched Underflow Condition Flag. LUF is set whenever UF
      (floating-point underflow flag) is set. LUF may only be cleared by a
      processor reset or by modifying it in the status register (ST).
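To make the integer flag definitions concrete, the following Python sketch models a 32-bit integer addition and the N, Z, V, C, and LV flags it would produce. It is an illustrative model under stated assumptions, not a description of the hardware: the names are hypothetical, the output is assumed to wrap rather than saturate, and N is taken from the MSB of the 32-bit output, which is an assumption for the overflow case.

```python
MASK32 = 0xFFFFFFFF
INT_MIN, INT_MAX = -(1 << 31), (1 << 31) - 1


def to_signed(word: int) -> int:
    """Interpret a 32-bit word as a two's-complement signed integer."""
    word &= MASK32
    return word - (1 << 32) if word & 0x80000000 else word


def add_with_flags(a: int, b: int, lv: int = 0) -> dict:
    """Add two 32-bit words and report the resulting condition flags.

    lv is the current latched-overflow flag; it can be set here but never cleared.
    """
    ua, ub = a & MASK32, b & MASK32
    wide = ua + ub                           # unsigned sum, up to 33 bits
    output = wide & MASK32                   # value written to the destination
    exact = to_signed(ua) + to_signed(ub)    # infinite-precision signed result
    v = int(exact < INT_MIN or exact > INT_MAX)
    return {
        "output": output,
        "N": output >> 31,                   # assumption: MSB of the output
        "Z": int(output == 0),
        "V": v,
        "C": wide >> 32,                     # carry out of the MSB
        "LV": lv | v,                        # latched: set by V, never cleared here
    }
```

For example, `add_with_flags(0x7FFFFFFF, 1)` sets V and LV but not C, whereas `add_with_flags(0xFFFFFFFF, 1)` sets C and Z but not V, which shows that carry and overflow are independent conditions.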


Table 11-8 lists the condition mnemonic, code, description, and flag for each 
of the 19 conditions. 
Table 11-8. Condition Codes and Flags


CONDITION   CODE    DESCRIPTION                            FLAG


UNCONDITIONAL COMPARES

U           00000   Unconditional


UNSIGNED COMPARES

                    Lower than
                    Lower or same
                    Higher than
                    Higher or same
                    Equal
                    Not equal


SIGNED COMPARES

                    Less than
                    Less than or equal
                    Greater than
                    Greater than or equal
                    Equal
                    Not equal


COMPARE TO ZERO

                    Zero
                    Not zero
                    Positive
                    Negative
                    Nonnegative


COMPARE TO CONDITION FLAGS

                    Nonnegative
                    Negative
                    Nonzero
                    Zero
                    No overflow
                    Overflow
                    No underflow
                    Underflow
                    No carry
                    Carry
                    No latched overflow
                    Latched overflow
                    No latched floating-point underflow
                    Latched floating-point underflow
                    Zero or floating-point underflow


~ Logical complement
11.3 Individual Instructions


This section contains the individual assembly language instructions for the 
TMS320C30. The instructions are listed in alphabetical order. Information, 
such as assembler syntax, operation, operands, encoding, description, cycles, 
status bits, mode bit, and examples, is provided for each instruction. An ex- 
ample instruction precedes the individual instruction listings to show the 
special format used and explain its content. 


Preceding the individual instruction descriptions, the symbols and abbrevi- 
ations used in the individual instructions are defined. In addition, some op- 
tional syntax forms allowed by the assembler are described. 


A functional grouping of the instructions is provided in Section 1.6. A com- 
plete instruction set summary can be found in Section 1.6.8. Appendix B lists 
the opcodes for all the instructions. Refer to Section 6 for information on 
memory addressing. Code examples using many of the instructions are given 
in Section 12, Software Applications. 


11.3.1 Symbols and Abbreviations 


Table 11-9 lists the symbols and abbreviations used in the individual instruc- 
tion descriptions. 
Table 11-9. Instruction Symbols


SYMBOL        MEANING


src           Source operand
src1          Source operand 1
src2          Source operand 2
src3          Source operand 3
src4          Source operand 4

dst           Destination operand
dst1          Destination operand 1
dst2          Destination operand 2
disp          Displacement
cond          Condition
count         Shift count

G             General addressing modes
T             Three-operand addressing modes
P             Parallel addressing modes
B             Conditional-branch addressing modes

ARn           Auxiliary register n
IRn           Index register n
Rn            Register address n
RC            Repeat count register
RE            Repeat end address register
RS            Repeat start address register
ST            Status register

C             Carry bit
GIE           Global interrupt enable bit
N             Trap vector
PC            Program counter
RM            Repeat mode flag
SP            System stack pointer

|x|           Absolute value of x
x → y         Assign the value of x to destination y
x(man)        Mantissa field (sign + fraction) of x
x(exp)        Exponent field of x

op1 || op2    Operation 1 performed in parallel with operation 2

x AND y       Bitwise logical-AND of x and y
x OR y        Bitwise logical-OR of x and y
x XOR y       Bitwise logical-XOR of x and y
~x            Bitwise logical-complement of x

x << y        Shift x to the left y bits
x >> y        Shift x to the right y bits

*++SP         Increment SP and use incremented SP as address
*SP--         Use SP as address and decrement SP
11.3.2 Optional Assembler Syntaxes 


The assembler allows a relaxed syntax form for some of the instructions. 
These optional forms simplify the assembly language so that special-case 
syntax can be ignored for some of the instructions. The following is a list of 
these optional syntax forms. 


The destination register can be omitted on unary arithmetic and logical 
operations when the same register is used as the source. For example, 

    ABSI R0,R0 can be written as ABSI R0 

Instructions affected: ABSI, ABSF, FIX, FLOAT, NEGB, NEGF, NEGI, 
NORM, NOT, RND. 


All 3-operand instructions can be written without the ‘3’. For example, 

    ADDI3 R0,R1,R2 can be written as ADDI R0,R1,R2 

Instructions affected: ADDC3, ADDF3, ADDI3, AND3, ANDN3, ASH3, 
LSH3, MPYF3, MPYI3, OR3, SUBB3, SUBF3, SUBI3, XOR3. 

This also applies to all the pertinent parallel instructions. 


All 3-operand comparison instructions can be written without the ‘3’. 
For example, 

    CMPI3 R0,*AR0 can be written as CMPI R0,*AR0 

Instructions affected: CMPI3, CMPF3, TSTB3. 


Indirect operands with an explicit 0 displacement are allowed. In 
3-operand or parallel instructions, operands with 0 displacement are 
automatically converted to “no-displacement” mode. For example, 

    LDI *+AR0(0),R1 is legal 

Also, 

    ADDI3 *+AR0(0),R1,R2 is equivalent to ADDI3 *AR0,R1,R2 


Indirect operands can be written with no displacement, in which case a 
displacement of one is assumed. For example, 

    LDI *AR0++(1),R0 can be written as LDI *AR0++,R0 


All conditional instructions accept the suffix ‘U’ to indicate uncondi- 
tional operation. Also, the U can be omitted from unconditional short 
branch instructions. For example, 

    BU label can be written as B label 


Labels can be written with or without a trailing colon. For example, 

    label0: NOP 
    label1  NOP 
    label2: 
            NOP 
Empty expressions are not allowed for the displacement in indirect 
mode: 

    LDI *+AR0(),R0 is not legal 


Long immediate mode operands (destination of BR and CALL) can be 
written with an at-sign: 

    BR label can be written as BR @label 


The LDP pseudo-op can be used to load a register (usually DP) with the 
8 MSBs of a relocatable address. The instruction is written: 

    LDP addr,REG or LDP @addr,REG 

The at-sign is optional. 

If the destination REG is the DP, it can be omitted. LDP generates an LDI 
instruction with an immediate operand and a special relocation type. 


Parallel instructions can be written in either order. For example, 

       ADDI       can be written as       STI 
    || STI                             || ADDI 


The parallel bars indicating part 2 of a parallel instruction can be written 
anywhere on the line, from column 0 to the mnemonic. For example, 

       ADDI       can be written as    ADDI 
    || STI                                || STI 


If the second operand of a parallel instruction is the same as the third 
(destination register) operand, the third operand can be omitted. This 
allows the writing of 3-operand parallel instructions that ‘look like’ nor- 
mal 2-operand instructions. For example, 

       ADDI *AR0,R2,R2    can be written as       ADDI *AR0,R2 
    || MPYI *AR1,R0,R0                         || MPYI *AR1,R0 

Instructions affected (this applies to all parallel instructions that have a 
register second operand): ADDI, ADDF, AND, MPYI, MPYF, OR, SUBI, 
SUBF, XOR. 


All commutative operations in parallel instructions can be written in either 
order. For example, the ADDI part of a parallel instruction can be written 
in either of two ways: 

    ADDI *AR0,R1,R2 or ADDI R1,*AR0,R2 

The instructions affected are parallel instructions containing any of the 
following: ADDI, ADDF, MPYI, MPYF, AND, OR, XOR. 
11.3.3 Individual Instruction Descriptions 


Each assembly language instruction for the TMS320C30 is described in this 
section. The instructions are listed in alphabetical order. An example instruc- 
tion precedes the individual instructions to show the special format used and 
explain its content. This example instruction describes the assembler syntax, 
operation, operands, encoding, description, cycles, status bits, mode bit, and 
examples. 
EXAMPLE Example Instruction 


Syntax        INST <src>,<dst> 
              or 
              INST1 <src2>,<dst1> 
           || INST2 <src3>,<dst2> 

              Each instruction begins with an assembler syntax expression. Labels may 
              be placed either preceding the command (instruction mnemonic) on the 
              same line or on the preceding line in the first column. The optional com- 
              ment field that concludes the syntax is not included in the syntax ex- 
              pression. Space(s) are required between each field (label, command, 
              operand, and comment fields). 

              The syntax examples illustrate the common one-line syntax and the two- 
              line syntax used in parallel addressing. Note that the two vertical bars || that 
              indicate a parallel addressing pair can be placed anywhere before the mne- 
              monic on the second line. The first instruction in the pair can have a label, 
              but the second instruction cannot have a label. 


Operation     |src| → dst 
              or 
              |src2| → dst1 
           || src3 → dst2 

              The instruction operation sequence describes the processing that takes 
              place when the instruction is executed. For parallel instructions, the opera- 
              tion sequence is performed in parallel. Conditional effects of status register 
              specified modes are listed for conditional instructions such as Bcond. 


Operands      src general addressing modes (G): 
                  0 0  register (Rn, 0 ≤ n ≤ 27) 
                  0 1  direct 
                  1 0  indirect 
                  1 1  immediate 

              dst register (Rn, 0 ≤ n ≤ 27) 

              or 

              src2 indirect (disp = 0, 1, IR0, IR1) 
              dst1 register (Rn1, 0 ≤ n1 ≤ 7) 
              src3 register (Rn2, 0 ≤ n2 ≤ 7) 
              dst2 indirect (disp = 0, 1, IR0, IR1) 

              Operands are defined according to the addressing mode and/or the type of 
              addressing used. Note that indirect addressing uses displacements and the 
              index registers. Refer to Section 6 for detailed information on addressing. 


Encoding      [Figure: encoding diagrams for the general-addressing and parallel- 
              addressing forms; bit positions 31, 24 23, 16 15, 8 7, 0] 

              Encoding examples are shown using general addressing and parallel ad- 
              dressing. The instruction pair for the parallel addressing example consists 
              of INST1 and INST2. 


Description   Instruction execution and its effect on the rest of the processor or memory 
              contents are described. Any constraints on the operands imposed by the 
              processor or the assembler are discussed. The description parallels and 
              supplements the information given by the operation block. 


Cycles        1 

              The digit specifies the number of cycles required to execute the instruction. 


Status Bits   N    Negative Condition Flag. 1 if a negative result is generated, 0 
                   otherwise. In some instructions, this flag is the MSB of the output. 
                   For other instructions, this flag is unaffected. 

              Z    Zero Condition Flag. 1 if a zero result is generated, 0 otherwise. 
                   For logical and shift instructions, 1 if a zero output is generated, 0 
                   otherwise. This flag may be unaffected. 

              V    Overflow Condition Flag. 1 if an integer or floating-point over- 
                   flow occurs, 0 otherwise. This flag may be unaffected. 

              C    Carry Flag. 1 if a carry or borrow occurs, 0 otherwise. For shift 
                   instructions, this flag is set to the value of the last bit shifted out; 0 
                   for a shift count of 0. This flag may be unaffected. 

              UF   Floating-Point Underflow Condition Flag. 1 if a floating-point 
                   underflow occurs, 0 otherwise. This flag may be unaffected. 

              LV   Latched Overflow Condition Flag. 1 if an integer or floating- 
                   point overflow occurs, unchanged otherwise. This flag may be un- 
                   affected. 

              LUF  Latched Floating-Point Underflow Condition Flag. 1 if a 
                   floating-point underflow occurs, unchanged otherwise. This flag 
                   may be unaffected. 

              The seven condition flags, stored in the status register (ST), are modified 
              by the majority of instructions. They provide information as to the properties 
              of the result or output of arithmetic or logical operations. 


Mode Bit      OVM  Overflow Mode Flag. In general, integer operations are affected 
                   by the OVM flag. 
Example       INST @98AEh,R5 

              Before Instruction: 

                  DP = 80h 
                  R5 = 0766900000h = 2.30562500e+02 
                  Memory at 8098AEh = 5CDFh = 1.00001107e+00 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 

              After Instruction: 

                  DP = 80h 
                  R5 = 0066900000h = 1.80126953e+00 
                  Memory at 8098AEh = 5CDFh = 1.00001107e+00 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 

              The sample code presented in the above format shows the effect of the 
              code on system pointers (e.g., DP or SP), registers (e.g., R1 or R5), mem- 
              ory at specific locations, and the seven status bits. The values given for the 
              registers include the leading zeros to show the exponent in floating-point 
              operations. Decimal conversions are provided for all register and memory 
              locations. The seven status bits are listed in the order in which they appear 
              in the assembler and simulator (see Table 11-9 and Section 11.2 for further 
              information on these seven status bits). 
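
The register values in these examples pair an exponent with a mantissa, so a small decoder can help when checking the decimal conversions by hand. The following Python sketch is not part of the original guide. It assumes the 40-bit extended-precision layout: an 8-bit two's-complement exponent in the upper bits, a 32-bit mantissa field holding a sign bit and a 31-bit fraction with an implied leading bit, and an exponent of -128 reserved for zero. The function name `decode_extended` is hypothetical.

```python
def decode_extended(word):
    """Decode a 40-bit register value (assumed layout: 8-bit exponent,
    32-bit mantissa = sign bit + 31-bit fraction) into a Python float."""
    exp = (word >> 32) & 0xFF
    if exp >= 0x80:                 # exponent is two's complement
        exp -= 0x100
    if exp == -128:                 # assumed encoding of zero
        return 0.0
    man = word & 0xFFFFFFFF
    frac = (man & 0x7FFFFFFF) / 2.0**31
    # Implied bit: 01.f for positive values, 10.f (that is, -2 + f) for negative.
    significand = (-2.0 + frac) if (man >> 31) else (1.0 + frac)
    return significand * 2.0**exp


print(decode_extended(0x0766900000))   # R5 before the sample instruction
print(decode_extended(0x0066900000))   # R5 after the sample instruction
```

Under these assumptions, the two R5 values in the sample decode to 230.5625 and 1.80126953125, which agrees with the decimal conversions listed beside them.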


Absolute Value of Floating-Point ABSF 


Syntax        ABSF <src>,<dst> 

Operation     |src| → dst 

Operands      src general addressing modes (G): 
                  0 0  register (Rn, 0 ≤ n ≤ 7) 
                  0 1  direct 
                  1 0  indirect 
                  1 1  immediate 

              dst register (Rn, 0 ≤ n ≤ 7) 

Encoding      [Figure: 32-bit instruction word; bit positions 31, 24 23, 16 15, 8 7, 0] 

Description   The absolute value of the src operand is loaded into the dst register. The 
              src and dst operands are assumed to be floating-point numbers. 

              An overflow occurs if src(man) = 80000000h and src(exp) = 7Fh. The 
              result is dst(man) = 7FFFFFFFh and dst(exp) = 7Fh. 

Cycles        1 

Status Bits   N    0 
              Z    1 if a zero result is generated, 0 otherwise. 
              V    1 if a floating-point overflow occurs, 0 otherwise. 
              C    Unaffected. 
              UF   0 
              LV   1 if a floating-point overflow occurs, unchanged otherwise. 
              LUF  Unaffected. 

Mode Bit      OVM  Operation not affected by OVM. 

Example       ABSF R4,R7 

              Before Instruction: 

                  R4 = 05C8000F971h = -9.90337307e+27 
                  R7 = 07D251100AEh = 5.48527255e+37 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 

              After Instruction: 

                  R4 = 05C8000F971h = -9.90337307e+27 
                  R7 = 05C7FFF068Fh = 9.90337307e+27 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 
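
The overflow rule for ABSF follows from the mantissa being a two's-complement field with an implied bit. The Python sketch below is an illustration under that assumption, not the device's documented algorithm. It treats the exponent as a signed integer and the mantissa as a 32-bit field (sign bit plus 31 fraction bits). The function name `absf` and the returned tuple layout are hypothetical.

```python
def absf(exp, man):
    """Return (exp, man, v) for |src|, given a signed exponent and a 32-bit
    mantissa field. v models the V flag (1 on floating-point overflow)."""
    if exp == -128 or (man >> 31) == 0:
        # Zero (assumed to use exponent -128) or already nonnegative.
        return exp, man, 0
    frac = man & 0x7FFFFFFF
    if frac != 0:
        # -(-2 + f) = 1 + (1 - f): exponent unchanged, fraction complemented.
        return exp, 0x80000000 - frac, 0
    if exp == 0x7F:
        # |-2.0 * 2**127| is not representable: saturate as the text describes.
        return 0x7F, 0x7FFFFFFF, 1
    # |-2.0 * 2**exp| = 1.0 * 2**(exp + 1)
    return exp + 1, 0x00000000, 0


print(absf(0x5C, 0x8000F971))   # the R4 value from the example above
print(absf(0x7F, 0x80000000))   # the overflow case
```

With the R4 value from the example, the sketch yields exponent 5Ch and mantissa 7FFF068Fh, matching the R7 result shown after the instruction.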
ABSF||STF Parallel ABSF and STF 


Syntax        ABSF <src2>,<dst1> 
           || STF <src3>,<dst2> 

Operation     |src2| → dst1 
           || src3 → dst2 

Operands      src2 indirect (disp = 0, 1, IR0, IR1) 
              dst1 register (Rn1, 0 ≤ n1 ≤ 7) 
              src3 register (Rn2, 0 ≤ n2 ≤ 7) 
              dst2 indirect (disp = 0, 1, IR0, IR1) 

Encoding      [Figure: 32-bit instruction word; bit positions 31, 24 23, 16 15, 8 7, 0] 

Description   A floating-point absolute value and a floating-point store are performed in 
              parallel. All registers are read at the beginning and loaded at the end of the 
              execute cycle. This means that if one of the parallel operations (STF) reads 
              from a register and the operation being performed in parallel (ABSF) writes 
              to the same register, then STF accepts as input the contents of the register 
              before it is modified by the ABSF. 

              If src2 and dst2 point to the same location, src2 is read before the write to 
              dst2. 

              An overflow occurs if src(man) = 80000000h and src(exp) = 7Fh. The 
              result is dst(man) = 7FFFFFFFh and dst(exp) = 7Fh. 

Cycles        1 

Status Bits   N    0 
              Z    1 if a zero result is generated, 0 otherwise. 
              V    1 if a floating-point overflow occurs, 0 otherwise. 
              C    Unaffected. 
              UF   0 
              LV   1 if a floating-point overflow occurs, unchanged otherwise. 
              LUF  Unaffected. 

Mode Bit      OVM  Operation not affected by OVM. 
Example       ABSF *++AR3(IR1),R4 
           || STF R4,*-AR7(1) 

              Before Instruction: 

                  AR3 = 809800h 
                  IR1 = 0AFh 
                  R4 = 733C00000h = 1.79750e+02 
                  AR7 = 8098C5h 
                  Data at 8098AFh = 58B4000h = -6.118750e+01 
                  Data at 8098C4h = 0h 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 

              After Instruction: 

                  AR3 = 8098AFh 
                  IR1 = 0AFh 
                  R4 = 574C00000h = 6.118750e+01 
                  AR7 = 8098C5h 
                  Data at 8098AFh = 58B4000h = -6.118750e+01 
                  Data at 8098C4h = 733C000h = 1.79750e+02 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 
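
The read-then-write ordering described for ABSF||STF can be modeled in a few lines. The following Python sketch is only an illustration: it represents register and memory contents as ordinary Python floats rather than C30 bit patterns, takes the effective addresses as already computed, and uses a hypothetical helper name, `absf_stf`.

```python
def absf_stf(regs, mem, src2_addr, dst1, src3, dst2_addr):
    """Model ABSF || STF: every source is sampled before any result is written."""
    # Read phase (start of the execute cycle).
    loaded = mem[src2_addr]     # operand for ABSF
    stored = regs[src3]         # operand for STF: the old register contents
    # Write phase (end of the execute cycle).
    regs[dst1] = abs(loaded)
    mem[dst2_addr] = stored


# Values taken from the example above: R4 is both the ABSF destination
# and the STF source.
regs = {"R4": 179.75}
mem = {0x8098AF: -61.1875, 0x8098C4: 0.0}
absf_stf(regs, mem, 0x8098AF, "R4", "R4", 0x8098C4)
print(regs["R4"], mem[0x8098C4])
```

Because the store operand is sampled before R4 is overwritten, memory receives the old value 179.75 while R4 ends up holding 61.1875, as in the example.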
ABSI Absolute Value of Integer 


Syntax        ABSI <src>,<dst> 

Operation     |src| → dst 

Operands      src general addressing modes (G): 
                  0 0  register (Rn, 0 ≤ n ≤ 27) 
                  0 1  direct 
                  1 0  indirect 
                  1 1  immediate 

              dst register (Rn, 0 ≤ n ≤ 27) 

Encoding      [Figure: 32-bit instruction word; bit positions 31, 24 23, 16 15, 8 7, 0] 

Description   The absolute value of the src operand is loaded into the dst register. The 
              src and dst operands are assumed to be signed integers. 

              An overflow occurs if src = 80000000h. If ST(OVM) = 1, the result is dst 
              = 7FFFFFFFh. If ST(OVM) = 0, the result is dst = 80000000h. 

Cycles        1 

Status Bits   N    0 
              Z    1 if a zero result is generated, 0 otherwise. 
              V    1 if an integer overflow occurs, 0 otherwise. 
              C    Unaffected. 
              UF   0 
              LV   1 if an integer overflow occurs, unchanged otherwise. 
              LUF  Unaffected. 

Mode Bit      OVM  Operation affected by OVM. 
Example       ABSI R0,R0 
              or 
              ABSI R0 

              Before Instruction: 

                  R0 = 0FFFFFFCBh = -53 

              After Instruction: 

                  R0 = 035h = 53 


Example       ABSI *AR1,R3 

              Before Instruction: 

                  AR1 = 20h 
                  R3 = 0h 
                  Data at 20h = 0FFFFFFCBh = -53 

              After Instruction: 

                  AR1 = 20h 
                  R3 = 35h = 53 
                  Data at 20h = 0FFFFFFCBh = -53 
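
The effect of the overflow mode on ABSI can be sketched in Python as follows. This is an illustrative model rather than the device's implementation. It assumes that with OVM = 0 the result of negating 80000000h wraps back to 80000000h, as 32-bit two's-complement arithmetic would give. The function name `absi` and the flag dictionary are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def absi(src, ovm):
    """Model ABSI on a 32-bit word. Returns (result, flags)."""
    src &= MASK32
    overflow = src == 0x80000000          # the only value with no positive twin
    if overflow:
        # Saturate when OVM = 1; otherwise the wrapped result is assumed.
        result = 0x7FFFFFFF if ovm else 0x80000000
    elif src >> 31:
        result = (-src) & MASK32          # negate a negative operand
    else:
        result = src
    flags = {"N": 0, "Z": int(result == 0), "V": int(overflow)}
    return result, flags


print(absi(0xFFFFFFCB, ovm=0))    # -53, as in the examples above
print(absi(0x80000000, ovm=1))    # saturating overflow case
```

For the operand 0FFFFFFCBh used in both examples, the sketch returns 35h (53) with V clear.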
ABSI||STI Parallel ABSI and STI 


Syntax        ABSI <src2>,<dst1> 
           || STI <src3>,<dst2> 

Operation     |src2| → dst1 
           || src3 → dst2 

Operands      src2 indirect (disp = 0, 1, IR0, IR1) 
              dst1 register (Rn1, 0 ≤ n1 ≤ 7) 
              src3 register (Rn2, 0 ≤ n2 ≤ 7) 
              dst2 indirect (disp = 0, 1, IR0, IR1) 

Encoding      [Figure: 32-bit instruction word; bit positions 31, 24 23, 16 15, 8 7, 0] 

Description   An integer absolute value and an integer store are performed in parallel. 
              All registers are read at the beginning and loaded at the end of the execute 
              cycle. This means that if one of the parallel operations (STI) reads from a 
              register and the operation being performed in parallel (ABSI) writes to the 
              same register, then STI accepts as input the contents of the register before 
              it is modified by the ABSI. 

              If src2 and dst2 point to the same location, src2 is read before the write to 
              dst2. 

              An overflow occurs if src = 80000000h. If ST(OVM) = 1, the result is dst 
              = 7FFFFFFFh. If ST(OVM) = 0, the result is dst = 80000000h. 

Cycles        1 

Status Bits   N    0 
              Z    1 if a zero result is generated, 0 otherwise. 
              V    1 if an integer overflow occurs, 0 otherwise. 
              C    Unaffected. 
              UF   0 
              LV   1 if an integer overflow occurs, unchanged otherwise. 
              LUF  Unaffected. 

Mode Bit      OVM  Operation affected by OVM. 


Example       ABSI *-AR5(1),R5 
           || STI R1,*AR2--(IR1) 

              Before Instruction: 

                  AR5 = 8099E2h 
                  R5 = 0h 
                  R1 = 42h = 66 
                  AR2 = 8098FFh 
                  IR1 = 0Fh 
                  Data at 8099E1h = 0FFFFFFCBh = -53 
                  Data at 8098FFh = 2h = 2 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 

              After Instruction: 

                  AR5 = 8099E2h 
                  R5 = 35h = 53 
                  R1 = 42h = 66 
                  AR2 = 8098F0h 
                  IR1 = 0Fh 
                  Data at 8099E1h = 0FFFFFFCBh = -53 
                  Data at 8098FFh = 42h = 66 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 
ADDC Add Integer with Carry 


Syntax        ADDC <src>,<dst> 

Operation     dst + src + C → dst 

Operands      src general addressing modes (G): 
                  0 0  register (Rn, 0 ≤ n ≤ 27) 
                  0 1  direct 
                  1 0  indirect 
                  1 1  immediate 

              dst register (Rn, 0 ≤ n ≤ 27) 

Encoding      [Figure: 32-bit instruction word; bit positions 31, 24 23, 16 15, 8 7, 0] 

Description   The sum of the dst and src operands and the C (carry) flag is loaded into 
              the dst register. The dst and src operands are assumed to be signed integers. 

Cycles        1 

Status Bits   N    1 if a negative result is generated, 0 otherwise. 
              Z    1 if a zero result is generated, 0 otherwise. 
              V    1 if an integer overflow occurs, 0 otherwise. 
              C    1 if a carry occurs, 0 otherwise. 
              UF   0 
              LV   1 if an integer overflow occurs, unchanged otherwise. 
              LUF  Unaffected. 

Mode Bit      OVM  Operation affected by OVM. 

Example       ADDC R1,R5 

              Before Instruction: 

                  R1 = 00FFFF5C25h = -41,947 
                  R5 = 00FFFF019Eh = -65,122 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 1 

              After Instruction: 

                  R1 = 00FFFF5C25h = -41,947 
                  R5 = 00FFFE5DC4h = -107,068 
                  LUF LV UF N Z V C = 0 0 0 1 0 0 1 
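
The flag results in the ADDC example can be checked with a short model. The following Python sketch is an illustration only: it works on the low 32 bits of each register, models the non-saturating case (OVM = 0), and derives V from the operand and result signs. The function name `addc` and the flag dictionary are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def addc(dst, src, carry_in):
    """Model dst + src + C on 32-bit words. Returns (result, flags)."""
    dst &= MASK32
    src &= MASK32
    total = dst + src + carry_in
    result = total & MASK32
    # Signed overflow: both operands share a sign that the result does not.
    overflow = ((~(dst ^ src) & (dst ^ result)) >> 31) & 1
    flags = {
        "N": result >> 31,
        "Z": int(result == 0),
        "V": overflow,
        "C": total >> 32,           # carry out of bit 31
    }
    return result, flags


# R5 + R1 + C with the values from the example above.
print(addc(0xFFFF019E, 0xFFFF5C25, 1))
```

With the example operands and C = 1, the sketch produces 0FFFE5DC4h (-107,068) with N = 1 and C = 1, in line with the status bits listed after the instruction.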
Add Integer With Carry, 3-Operand ADDC3 


Syntax        ADDC3 <src2>,<src1>,<dst> 

Operation     src1 + src2 + C → dst 

Operands      src1 three-operand addressing modes (T): 
                  0 0  register (Rn1, 0 ≤ n1 ≤ 27) 
                  0 1  indirect (disp = 0, 1, IR0, IR1) 
                  1 0  register (Rn1, 0 ≤ n1 ≤ 27) 
                  1 1  indirect (disp = 0, 1, IR0, IR1) 

              src2 three-operand addressing modes (T): 
                  0 0  register (Rn2, 0 ≤ n2 ≤ 27) 
                  0 1  register (Rn2, 0 ≤ n2 ≤ 27) 
                  1 0  indirect (disp = 0, 1, IR0, IR1) 
                  1 1  indirect (disp = 0, 1, IR0, IR1) 

              dst register (Rn, 0 ≤ n ≤ 27) 

Encoding      [Figure: 32-bit instruction word; bit positions 31, 24 23, 16 15, 8 7, 0] 

Description   The sum of the src1 and src2 operands and the C (carry) flag is loaded into 
              the dst register. The src1, src2, and dst operands are assumed to be signed 
              integers. 

Cycles        1 

Status Bits   N    1 if a negative result is generated, 0 otherwise. 
              Z    1 if a zero result is generated, 0 otherwise. 
              V    1 if an integer overflow occurs, 0 otherwise. 
              C    1 if a carry occurs, 0 otherwise. 
              UF   0 
              LV   1 if an integer overflow occurs, unchanged otherwise. 
              LUF  Unaffected. 

Mode Bit      OVM  Operation affected by OVM. 
Example       ADDC3 *AR5++(IR0),R5,R2 
              or 
              ADDC3 R5,*AR5++(IR0),R2 

              Before Instruction: 

                  AR5 = 809908h 
                  IR0 = 10h 
                  R5 = 066h = 102 
                  R2 = 0h 
                  Data at 809908h = 0FFFFFFCBh = -53 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 1 

              After Instruction: 

                  AR5 = 809918h 
                  IR0 = 10h 
                  R5 = 066h = 102 
                  R2 = 032h = 50 
                  Data at 809908h = 0FFFFFFCBh = -53 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 1 


Example       ADDC3 R2,R7,R0 

              Before Instruction: 

                  R2 = 02BCh = 700 
                  R7 = 0F82h = 3970 
                  R0 = 0h 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 1 

              After Instruction: 

                  R2 = 02BCh = 700 
                  R7 = 0F82h = 3970 
                  R0 = 0123Fh = 4671 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 
Add Floating-Point ADDF 


Syntax        ADDF <src>,<dst> 

Operation     dst + src → dst 

Operands      src general addressing modes (G): 
                  0 0  register (Rn, 0 ≤ n ≤ 7) 
                  0 1  direct 
                  1 0  indirect 
                  1 1  immediate 

              dst register (Rn, 0 ≤ n ≤ 7) 

Encoding      [Figure: 32-bit instruction word; bit positions 31, 24 23, 16 15, 8 7, 0] 

Description   The sum of the dst and src operands is loaded into the dst register. The 
              dst and src operands are assumed to be floating-point numbers. 

Cycles        1 

Status Bits   N    1 if a negative result is generated, 0 otherwise. 
              Z    1 if a zero result is generated, 0 otherwise. 
              V    1 if a floating-point overflow occurs, 0 otherwise. 
              C    Unaffected. 
              UF   1 if a floating-point underflow occurs, 0 otherwise. 
              LV   1 if a floating-point overflow occurs, unchanged otherwise. 
              LUF  1 if a floating-point underflow occurs, unchanged otherwise. 

Mode Bit      OVM  Operation not affected by OVM. 

Example       ADDF *AR4++(IR1),R5 

              Before Instruction: 

                  AR4 = 809800h 
                  IR1 = 12Bh 
                  R5 = 0579800000h = 6.23750e+01 
                  Data at 80992Bh = 86B2800h = 4.7031250e+02 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 

              After Instruction: 

                  AR4 = 80992Bh 
                  IR1 = 12Bh 
                  R5 = 09052C0000h = 5.3268750e+02 
                  Data at 80992Bh = 86B2800h = 4.7031250e+02 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 


ADDF3 Add Floating-Point, 3-Operand 


Syntax        ADDF3 <src2>,<src1>,<dst> 

Operation     src1 + src2 → dst 

Operands      src1 three-operand addressing modes (T): 
                  0 0  register (Rn1, 0 ≤ n1 ≤ 7) 
                  0 1  indirect (disp = 0, 1, IR0, IR1) 
                  1 0  register (Rn1, 0 ≤ n1 ≤ 7) 
                  1 1  indirect (disp = 0, 1, IR0, IR1) 

              src2 three-operand addressing modes (T): 
                  0 0  register (Rn2, 0 ≤ n2 ≤ 7) 
                  0 1  register (Rn2, 0 ≤ n2 ≤ 7) 
                  1 0  indirect (disp = 0, 1, IR0, IR1) 
                  1 1  indirect (disp = 0, 1, IR0, IR1) 

              dst register (Rn, 0 ≤ n ≤ 7) 

Encoding      [Figure: 32-bit instruction word; bit positions 31, 24 23, 16 15, 8 7, 0] 

Description   The sum of the src1 and src2 operands is loaded into the dst register. The 
              src1, src2, and dst operands are assumed to be floating-point numbers. 

Cycles        1 

Status Bits   N    1 if a negative result is generated, 0 otherwise. 
              Z    1 if a zero result is generated, 0 otherwise. 
              V    1 if a floating-point overflow occurs, 0 otherwise. 
              C    Unaffected. 
              UF   1 if a floating-point underflow occurs, 0 otherwise. 
              LV   1 if a floating-point overflow occurs, unchanged otherwise. 
              LUF  1 if a floating-point underflow occurs, unchanged otherwise. 

Mode Bit      OVM  Operation not affected by OVM. 

Example       ADDF3 R6,R5,R1 
              or 
              ADDF3 R5,R6,R1 

              Before Instruction: 

                  R6 = 086B280000h = 4.7031250e+02 
                  R5 = 0579800000h = 6.23750e+01 
                  R1 = 0h 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 

              After Instruction: 

                  R6 = 086B280000h = 4.7031250e+02 
                  R5 = 0579800000h = 6.23750e+01 
                  R1 = 09052C0000h = 5.3268750e+02 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 
Example       ADDF3 *+AR1(1),*AR7++(IR0),R4 

              Before Instruction: 

                  AR1 = 809820h 
                  AR7 = 8099F0h 
                  IR0 = 8h 
                  R4 = 0h 
                  Data at 809821h = 700F000h = 1.28940e+02 
                  Data at 8099F0h = 34C2000h = 1.27590e+01 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 

              After Instruction: 

                  AR1 = 809820h 
                  AR7 = 8099F8h 
                  IR0 = 8h 
                  R4 = 070DB20000h = 1.41695313e+02 
                  Data at 809821h = 700F000h = 1.28940e+02 
                  Data at 8099F0h = 34C2000h = 1.27590e+01 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 
ADDF3||STF Parallel ADDF3 and STF 


Syntax        ADDF3 <src2>,<src1>,<dst1> 
           || STF <src3>,<dst2> 

Operation     src1 + src2 → dst1 
           || src3 → dst2 

Operands      src1 register (Rn1, 0 ≤ n1 ≤ 7) 
              src2 indirect (disp = 0, 1, IR0, IR1) 
              dst1 register (Rn2, 0 ≤ n2 ≤ 7) 
              src3 register (Rn3, 0 ≤ n3 ≤ 7) 
              dst2 indirect (disp = 0, 1, IR0, IR1) 

Encoding      [Figure: 32-bit instruction word; bit positions 31, 24 23, 16 15, 8 7, 0] 

Description   A floating-point addition and a floating-point store are performed in paral- 
              lel. All registers are read at the beginning and loaded at the end of the ex- 
              ecute cycle. This means that if one of the parallel operations (STF) reads 
              from a register and the operation being performed in parallel (ADDF3) 
              writes to the same register, then STF accepts as input the contents of the 
              register before it is modified by the ADDF3. 

              If src2 and dst2 point to the same location, src2 is read before the write to 
              dst2. 

Cycles        1 

Status Bits   N    1 if a negative result is generated, 0 otherwise. 
              Z    1 if a zero result is generated, 0 otherwise. 
              V    1 if a floating-point overflow occurs, 0 otherwise. 
              C    Unaffected. 
              UF   1 if a floating-point underflow occurs, 0 otherwise. 
              LV   1 if a floating-point overflow occurs, unchanged otherwise. 
              LUF  1 if a floating-point underflow occurs, unchanged otherwise. 

Mode Bit      OVM  Operation not affected by OVM. 
Example       ADDF3 *+AR3(IR1),R2,R5 
           || STF R4,*AR2 

              Before Instruction: 

                  AR3 = 809800h 
                  IR1 = 0A5h 
                  R2 = 070C800000h = 1.4050e+02 
                  R5 = 0h 
                  R4 = 057B400000h = 6.281250e+01 
                  AR2 = 8098F3h 
                  Data at 8098A5h = 733C000h = 1.79750e+02 
                  Data at 8098F3h = 0h 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 

              After Instruction: 

                  AR3 = 809800h 
                  IR1 = 0A5h 
                  R2 = 070C800000h = 1.4050e+02 
                  R5 = 0820200000h = 3.20250e+02 
                  R4 = 057B400000h = 6.281250e+01 
                  AR2 = 8098F3h 
                  Data at 8098A5h = 733C000h = 1.79750e+02 
                  Data at 8098F3h = 57B4000h = 6.28125e+01 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 
ADDI Add Integer 


Syntax        ADDI <src>,<dst> 

Operation     dst + src → dst 

Operands      src general addressing modes (G): 
                  0 0  register (Rn, 0 ≤ n ≤ 27) 
                  0 1  direct 
                  1 0  indirect 
                  1 1  immediate 

              dst register (Rn, 0 ≤ n ≤ 27) 

Encoding      [Figure: 32-bit instruction word; bit positions 31, 24 23, 16 15, 8 7, 0] 

Description   The sum of the dst and src operands is loaded into the dst register. The 
              dst and src operands are assumed to be signed integers. 

Cycles        1 

Status Bits   N    1 if a negative result is generated, 0 otherwise. 
              Z    1 if a zero result is generated, 0 otherwise. 
              V    1 if an integer overflow occurs, 0 otherwise. 
              C    1 if a carry occurs, 0 otherwise. 
              UF   0 
              LV   1 if an integer overflow occurs, unchanged otherwise. 
              LUF  Unaffected. 

Mode Bit      OVM  Operation affected by OVM. 

Example       ADDI R3,R7 

              Before Instruction: 

                  R3 = 0FFFFFFCBh = -53 
                  R7 = 35h = 53 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 

              After Instruction: 

                  R3 = 0FFFFFFCBh = -53 
                  R7 = 0h 
                  LUF LV UF N Z V C = 0 0 0 0 1 0 0 
Add Integer, 3-Operand ADDI3 


Syntax        ADDI3 <src2>,<src1>,<dst> 

Operation     src1 + src2 → dst 

Operands      src1 three-operand addressing modes (T): 
                  0 0  register (Rn1, 0 ≤ n1 ≤ 27) 
                  0 1  indirect (disp = 0, 1, IR0, IR1) 
                  1 0  register (Rn1, 0 ≤ n1 ≤ 27) 
                  1 1  indirect (disp = 0, 1, IR0, IR1) 

              src2 three-operand addressing modes (T): 
                  0 0  register (Rn2, 0 ≤ n2 ≤ 27) 
                  0 1  register (Rn2, 0 ≤ n2 ≤ 27) 
                  1 0  indirect (disp = 0, 1, IR0, IR1) 
                  1 1  indirect (disp = 0, 1, IR0, IR1) 

              dst register (Rn, 0 ≤ n ≤ 27) 

Encoding      [Figure: 32-bit instruction word; bit positions 31, 24 23, 16 15, 8 7, 0] 

Description   The sum of the src1 and src2 operands is loaded into the dst register. The 
              src1, src2, and dst operands are assumed to be signed integers. 

Cycles        1 

Status Bits   N    1 if a negative result is generated, 0 otherwise. 
              Z    1 if a zero result is generated, 0 otherwise. 
              V    1 if an integer overflow occurs, 0 otherwise. 
              C    1 if a carry occurs, 0 otherwise. 
              UF   0 
              LV   1 if an integer overflow occurs, unchanged otherwise. 
              LUF  Unaffected. 

Mode Bit      OVM  Operation affected by OVM. 
Example       ADDI3 R4,R7,R5 

              Before Instruction: 

                  R4 = 0DCh = 220 
                  R7 = 0A0h = 160 
                  R5 = 10h = 16 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 

              After Instruction: 

                  R4 = 0DCh = 220 
                  R7 = 0A0h = 160 
                  R5 = 017Ch = 380 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 


Example       ADDI3 *-AR3(1),*AR6--(IR0),R2 

              Before Instruction: 

                  AR3 = 809802h 
                  AR6 = 809930h 
                  IR0 = 18h 
                  R2 = 10h = 16 
                  Data at 809801h = 2AF8h = 11,000 
                  Data at 809930h = 3A98h = 15,000 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 

              After Instruction: 

                  AR3 = 809802h 
                  AR6 = 809918h 
                  IR0 = 18h 
                  R2 = 06590h = 26,000 
                  Data at 809801h = 2AF8h = 11,000 
                  Data at 809930h = 3A98h = 15,000 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 
Parallel ADDI3 and STI ADDI3||STI 


Syntax        ADDI3 <src2>,<src1>,<dst1> 
           || STI <src3>,<dst2> 

Operation     src1 + src2 → dst1 
           || src3 → dst2 

Operands      src1 register (Rn1, 0 ≤ n1 ≤ 7) 
              src2 indirect (disp = 0, 1, IR0, IR1) 
              dst1 register (Rn2, 0 ≤ n2 ≤ 7) 
              src3 register (Rn3, 0 ≤ n3 ≤ 7) 
              dst2 indirect (disp = 0, 1, IR0, IR1) 

Encoding      [Figure: 32-bit instruction word; bit positions 31, 24 23, 16 15, 8 7, 0] 

Description   An integer addition and an integer store are performed in parallel. All reg- 
              isters are read at the beginning and loaded at the end of the execute cycle. 
              This means that if one of the parallel operations (STI) reads from a register 
              and the operation being performed in parallel (ADDI3) writes to the same 
              register, then STI accepts as input the contents of the register before it is 
              modified by the ADDI3. 

              If src2 and dst2 point to the same location, src2 is read before the write to 
              dst2. 

Cycles        1 

Status Bits   N    1 if a negative result is generated, 0 otherwise. 
              Z    1 if a zero result is generated, 0 otherwise. 
              V    1 if an integer overflow occurs, 0 otherwise. 
              C    1 if a carry occurs, 0 otherwise. 
              UF   0 
              LV   1 if an integer overflow occurs, unchanged otherwise. 
              LUF  Unaffected. 

Mode Bit      OVM  Operation affected by OVM. 

Example       ADDI3 *AR0--(IR0),R5,R0 
           || STI R3,*AR7 

              Before Instruction: 

                  AR0 = 80992Ch 
                  IR0 = 0Ch 
                  R5 = 0DCh = 220 
                  R0 = 0h 
                  R3 = 35h = 53 
                  AR7 = 80983Bh 
                  Data at 80992Ch = 12Ch = 300 
                  Data at 80983Bh = 0h 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 
              After Instruction: 

                  AR0 = 809920h 
                  IR0 = 0Ch 
                  R5 = 0DCh = 220 
                  R0 = 208h = 520 
                  R3 = 35h = 53 
                  AR7 = 80983Bh 
                  Data at 80992Ch = 12Ch = 300 
                  Data at 80983Bh = 35h = 53 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 
Bitwise Logical-AND AND 


Syntax        AND <src>,<dst> 

Operation     dst AND src → dst 

Operands      src general addressing modes (G): 
                  0 0  register (Rn, 0 ≤ n ≤ 27) 
                  0 1  direct 
                  1 0  indirect 
                  1 1  immediate (not sign-extended) 

              dst register (Rn, 0 ≤ n ≤ 27) 

Encoding      [Figure: 32-bit instruction word; bit positions 31, 24 23, 16 15, 8 7, 0] 

Description   The bitwise logical-AND between the dst and src operands is loaded into 
              the dst register. The dst and src operands are assumed to be unsigned in- 
              tegers. 

Cycles        1 

Status Bits   N    MSB of the output. 
              Z    1 if a zero output is generated, 0 otherwise. 
              V    0 
              C    Unaffected. 
              UF   0 
              LV   Unaffected. 
              LUF  Unaffected. 

Mode Bit      OVM  Operation not affected by OVM. 

Example       AND R1,R2 

              Before Instruction: 

                  R1 = 80h 
                  R2 = 0AFFh 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 1 

              After Instruction: 

                  R1 = 80h 
                  R2 = 80h 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 1 
AND3 Bitwise Logical-AND, 3-Operand 


Syntax        AND3 <src2>,<src1>,<dst> 

Operation     src1 AND src2 → dst 

Operands      src1 three-operand addressing modes (T): 
                  0 0  register (Rn1, 0 ≤ n1 ≤ 27) 
                  0 1  indirect (disp = 0, 1, IR0, IR1) 
                  1 0  register (Rn1, 0 ≤ n1 ≤ 27) 
                  1 1  indirect (disp = 0, 1, IR0, IR1) 

              src2 three-operand addressing modes (T): 
                  0 0  register (Rn2, 0 ≤ n2 ≤ 27) 
                  0 1  register (Rn2, 0 ≤ n2 ≤ 27) 
                  1 0  indirect (disp = 0, 1, IR0, IR1) 
                  1 1  indirect (disp = 0, 1, IR0, IR1) 

              dst register (Rn, 0 ≤ n ≤ 27) 

Encoding      [Figure: 32-bit instruction word; bit positions 31, 24 23, 16 15, 8 7, 0] 

Description   The bitwise logical-AND between the src1 and src2 operands is loaded into 
              the dst register. The src1, src2, and dst operands are assumed to be un- 
              signed integers. 

Cycles        1 

Status Bits   N    MSB of the output. 
              Z    1 if a zero output is generated, 0 otherwise. 
              V    0 
              C    Unaffected. 
              UF   0 
              LV   Unaffected. 
              LUF  Unaffected. 

Mode Bit      OVM  Operation not affected by OVM. 


Example       AND3 *AR0--(IR0),*+AR1,R4 

              Before Instruction: 

                  AR0 = 8098F4h 
                  IR0 = 50h 
                  AR1 = 809951h 
                  R4 = 0h 
                  Data at 8098A4h = 30h 
                  Data at 809952h = 123h 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 

              After Instruction: 

                  AR0 = 8098A4h 
                  IR0 = 50h 
                  AR1 = 809951h 
                  R4 = 020h 
                  Data at 8098A4h = 30h 
                  Data at 809952h = 123h 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 


Example       AND3 *-AR5,R7,R4 

              Before Instruction: 

                  AR5 = 80985Ch 
                  R7 = 2h 
                  R4 = 0h 
                  Data at 80985Bh = 0AFFh 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 

              After Instruction: 

                  AR5 = 80985Ch 
                  R7 = 2h 
                  R4 = 2h 
                  Data at 80985Bh = 0AFFh 
                  LUF LV UF N Z V C = 0 0 0 0 0 0 0 
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- AND3I|STI Parallel AND3 and STI 


Syntax AND3 <src2>,<src1>,<dst1>
|| STI <src3>,<dst2>

Operation src1 AND src2 → dst1
|| src3 → dst2

Operands src1 register (Rn1, 0 ≤ n1 ≤ 7)
src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn2, 0 ≤ n2 ≤ 7)
src3 register (Rn3, 0 ≤ n3 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

Description A bitwise logical-AND and an integer store are performed in parallel. All
registers are read at the beginning and loaded at the end of the execute
cycle. This means that if one of the parallel operations (STI) reads from a
register and the operation being performed in parallel (AND3) writes to the
same register, then STI accepts as input the contents of the register before
it is modified by the AND3.

If src2 and dst2 point to the same location, src2 is read before the write to
dst2.

Cycles 1

Status Bits N MSB of the output.
Z 1 if a zero output is generated, 0 otherwise.
V 0
C Unaffected.
UF 0
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.
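
The read-before-write ordering of the parallel instruction can be sketched in Python, used here only as a modeling language. The function name and_3_sti, its parameters, and the dictionary-based register and memory model are hypothetical; status bits are not modeled.

```python
def and_3_sti(regs, mem, src1, src2_addr, dst1, src3, dst2_addr):
    """Model AND3 || STI: sample every input first, then commit both results."""
    # Read phase: all operands are sampled at the start of the execute cycle.
    and_result = (regs[src1] & mem[src2_addr]) & 0xFFFFFFFF
    store_value = regs[src3]
    # Write phase: results are loaded at the end of the execute cycle.
    regs[dst1] = and_result
    mem[dst2_addr] = store_value
    return regs, mem


# Values taken from the AND3||STI example in this section.
regs = {"R3": 0x35, "R4": 0xA323, "R7": 0x0}
mem = {0x8099F9: 0x5C53, 0x80983F: 0x0}
and_3_sti(regs, mem, "R4", 0x8099F9, "R7", "R3", 0x80983F)

# Hazard case: STI stores the same register that AND3 overwrites.
regs2 = {"R4": 0xFF, "R7": 0x11}
mem2 = {0x0: 0x0F, 0x1: 0x0}
and_3_sti(regs2, mem2, "R4", 0x0, "R7", "R7", 0x1)
```

In the hazard case the store writes the old value of R7 (11h) to memory, while R7 itself receives the AND result (0Fh).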


Example AND3 *+AR1(IR0),R4,R7
|| STI R3,*AR2

Before Instruction:
AR1 = 8099F1h
IR0 = 8h
R4 = 0A323h
R7 = 0h
R3 = 35h = 53
AR2 = 80983Fh
Data at 8099F9h = 5C53h
Data at 80983Fh = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:
AR1 = 8099F1h
IR0 = 8h
R4 = 0A323h
R7 = 03h
R3 = 35h = 53
AR2 = 80983Fh
Data at 8099F9h = 5C53h
Data at 80983Fh = 35h = 53
LUF LV UF N Z V C = 0 0 0 0 0 0 0

ANDN Bitwise Logical-AND with Complement


Syntax ANDN <src>,<dst>

Operation dst AND ~src → dst

Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 27)
01 direct
10 indirect
11 immediate (not sign-extended)

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

Description The bitwise logical-AND between the dst operand and the bitwise logical
complement (~) of the src operand is loaded into the dst register. The dst
and src operands are assumed to be unsigned integers.

Cycles 1

Status Bits N MSB of the output.
Z 1 if a zero output is generated, 0 otherwise.
V 0
C Unaffected.
UF 0
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.


Example ANDN @980Ch,R2

Before Instruction:
DP = 80h
R2 = 0C2Fh
Data at 80980Ch = 0A02h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:
DP = 80h
R2 = 042Dh
Data at 80980Ch = 0A02h
LUF LV UF N Z V C = 0 0 0 0 0 0 0


Bitwise Logical-ANDN, 3-Operand ANDN3 


Syntax ANDN3 <src2>,<src1>,<dst>

Operation src1 AND ~src2 → dst

Operands src1 three-operand addressing modes (T):
00 register (Rn1, 0 ≤ n1 ≤ 27)
01 indirect (disp = 0, 1, IR0, IR1)
10 register (Rn1, 0 ≤ n1 ≤ 27)
11 indirect (disp = 0, 1, IR0, IR1)

src2 three-operand addressing modes (T):
00 register (Rn2, 0 ≤ n2 ≤ 27)
01 register (Rn2, 0 ≤ n2 ≤ 27)
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

Description The bitwise logical-AND between the src1 operand and the bitwise logical
complement (~) of the src2 operand is loaded into the dst register. The
src1, src2, and dst operands are assumed to be unsigned integers.

Cycles 1

Status Bits N MSB of the output.
Z 1 if a zero output is generated, 0 otherwise.
V 0
C Unaffected.
UF 0
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.

Example ANDN3 R5,R3,R7

Before Instruction:
R5 = 0A02h
R3 = 0C2Fh
R7 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:
R5 = 0A02h
R3 = 0C2Fh
R7 = 042Dh
LUF LV UF N Z V C = 0 0 0 0 0 0 0


Example ANDN3 R1,*AR5++(IR0),R0

Before Instruction:
R1 = 0CFh
AR5 = 809825h
IR0 = 5h
R0 = 0h
Data at 809825h = 0FFFh
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:
R1 = 0CFh
AR5 = 80982Ah
IR0 = 5h
R0 = 0F30h
Data at 809825h = 0FFFh
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Arithmetic Shift ASH


Syntax ASH <count>,<dst>

Operation If (count ≥ 0):
dst << count → dst
Else:
dst >> |count| → dst

Operands count general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 27)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

Description The seven least-significant bits of the count operand are used to generate
the two's-complement shift count of up to 32 bits.

If the count operand is greater than zero, the dst operand is left-shifted by
the value of the count operand. Low-order bits shifted in are zero-filled,
and high-order bits are shifted out through the C (carry) bit.

Arithmetic left-shift:
C ← dst ← 0

If the count operand is less than zero, the dst operand is right-shifted by the
absolute value of the count operand. The high-order bits of the dst operand
are sign-extended as it is right-shifted. Low-order bits are shifted out
through the C (carry) bit.

Arithmetic right-shift:
sign of dst → C

If the count operand is zero, no shift is performed, and the C (carry) bit is
set to 0. The count and dst operands are assumed to be signed integers.

Cycles 1

Status Bits N MSB of the output.
Z 1 if a zero output is generated, 0 otherwise.
V 1 if an integer overflow occurs, 0 otherwise.
C Set to the value of the last bit shifted out. 0 for a shift count of 0.
Unaffected if dst is not R0 - R7.
UF 0
LV 1 if an integer overflow occurs, unchanged otherwise.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.
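
The shift-count extraction and carry behavior described for ASH can be sketched in Python, used here only as a modeling language. The names shift_count and ash are hypothetical, registers are modeled as 32-bit values, and the V, LV, N, and Z bits are not modeled.

```python
MASK32 = 0xFFFFFFFF


def shift_count(count_operand):
    """Interpret the 7 least-significant bits as a two's-complement count."""
    c = count_operand & 0x7F
    return c - 0x80 if c & 0x40 else c


def ash(count_operand, dst):
    """Return (result, carry) for an arithmetic shift of a 32-bit value."""
    count = shift_count(count_operand)
    dst &= MASK32
    if count == 0:
        return dst, 0                      # no shift; C is set to 0
    if count > 0:
        wide = dst << count                # zero-fill from the right
        carry = (wide >> 32) & 1           # last bit shifted out of bit 31
        return wide & MASK32, carry
    n = -count
    signed = dst - (1 << 32) if dst & 0x80000000 else dst
    carry = (signed >> (n - 1)) & 1        # last bit shifted out of bit 0
    return (signed >> n) & MASK32, carry   # Python's >> sign-extends
```

For instance, ash(0x10, 0xAE000) models a left shift by 16, and ash(0xFFE8, 0xAEC00001) models a right shift by 24, because the seven low bits of 0FFE8h encode -24.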


Example ASH R1,R3

Before Instruction:
R1 = 10h = 16
R3 = 0AE000h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:
R1 = 10h
R3 = 0E0000000h
LUF LV UF N Z V C = 0 1 0 1 0 1 0

Example ASH @98C3h,R5

Before Instruction:
DP = 80h
R5 = 0AEC00001h
Data at 8098C3h = 0FFE8h = -24
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:
DP = 80h
R5 = 0FFFFFFAEh
Data at 8098C3h = 0FFE8h = -24
LUF LV UF N Z V C = 0 0 0 1 0 0 1

Arithmetic Shift, 3-Operand ASH3


Syntax ASH3 <count>,<src>,<dst>

Operation If (count ≥ 0):
src << count → dst
Else:
src >> |count| → dst

Operands count three-operand addressing modes (T):
00 register (Rn1, 0 ≤ n1 ≤ 27)
01 indirect (disp = 0, 1, IR0, IR1)
10 register (Rn1, 0 ≤ n1 ≤ 27)
11 indirect (disp = 0, 1, IR0, IR1)

src three-operand addressing modes (T):
00 register (Rn2, 0 ≤ n2 ≤ 27)
01 register (Rn2, 0 ≤ n2 ≤ 27)
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

Description The seven least-significant bits of the count operand are used to generate
the two's-complement shift count of up to 32 bits.

If the count operand is greater than zero, the src operand is left-shifted by
the value of the count operand. Low-order bits shifted in are zero-filled,
and high-order bits are shifted out through the C (carry) bit.

Arithmetic left-shift:
C ← src ← 0

If the count operand is less than zero, the src operand is right-shifted by the
absolute value of the count operand. The high-order bits of the src operand
are sign-extended as it is right-shifted. Low-order bits are shifted out
through the C (carry) bit.

Arithmetic right-shift:
sign of dst → C

If the count operand is zero, no shift is performed, and the C (carry) bit is
set to 0. The count, src, and dst operands are assumed to be signed integers.

Cycles 1

Status Bits N MSB of the output.
Z 1 if a zero output is generated, 0 otherwise.
V 1 if an integer overflow occurs, 0 otherwise.
C Set to the value of the last bit shifted out. 0 for a shift count of 0.
Unaffected if dst is not R0 - R7.
UF 0
LV 1 if an integer overflow occurs, unchanged otherwise.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.


Example ASH3 *AR3--(1),R5,R0

Before Instruction:
AR3 = 809921h
R5 = 02B0h
R0 = 0h
Data at 809921h = 10h
LUF LV UF N Z V C

After Instruction:
AR3 = 809920h
R5 = 000002B0h
R0 = 02B00000h
Data at 809921h = 10h = 16
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Example ASH3 R1,R3,R5

Before Instruction:
R1 = 0FFFFFFF8h = -8
R3 = 0FFFFCB00h
R5 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:
R1 = 0FFFFFFF8h = -8
R3 = 0FFFFCB00h
R5 = 0FFFFFFCBh
LUF LV UF N Z V C = 0 0 0 1 0 0 0

Parallel ASH3 and STI ASH3||STI


Syntax ASH3 <count>,<src2>,<dst1>
|| STI <src3>,<dst2>

Operation If (count ≥ 0):
src2 << count → dst1
Else:
src2 >> |count| → dst1
|| src3 → dst2

Operands count register (Rn1, 0 ≤ n1 ≤ 7)
src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn2, 0 ≤ n2 ≤ 7)
src3 register (Rn3, 0 ≤ n3 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

Description The seven least-significant bits of the count operand register are used to
generate the two's-complement shift count of up to 32 bits.

If the count operand is greater than zero, the src2 operand is left-shifted by
the value of the count operand. Low-order bits shifted in are zero-filled,
and high-order bits are shifted out through the C (carry) bit.

Arithmetic left-shift:
C ← src2 ← 0

If the count operand is less than zero, the src2 operand is right-shifted by the
absolute value of the count operand. The high-order bits of the src2 operand
are sign-extended as it is right-shifted. Low-order bits are shifted out
through the C (carry) bit.

Arithmetic right-shift:
sign of src2 → C

If the count operand is zero, no shift is performed, and the C (carry) bit is
set to 0. The count and dst operands are assumed to be signed integers.

All registers are read at the beginning and loaded at the end of the execute
cycle. This means that if one of the parallel operations (STI) reads from a
register and the operation being performed in parallel (ASH3) writes to the
same register, then STI accepts as input the contents of the register before
it is modified by the ASH3.

If src2 and dst2 point to the same location, src2 is read before the write to
dst2.

Cycles 1

Status Bits N MSB of the output.
Z 1 if a zero output is generated, 0 otherwise.
V 1 if an integer overflow occurs, 0 otherwise.
C Set to the value of the last bit shifted out. 0 for a shift count of 0.
UF 0
LV 1 if an integer overflow occurs, unchanged otherwise.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.

Example ASH3 R1,*AR6++(IR1),R0
|| STI R5,*AR2

Before Instruction:
AR6 = 809900h
IR1 = 8Ch
R1 = 0FFE8h = -24
R0 = 0h
R5 = 35h = 53
AR2 = 8098A2h
Data at 809900h = 0AE000000h
Data at 8098A2h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:
AR6 = 80998Ch
IR1 = 8Ch
R1 = 0FFE8h = -24
R0 = 0FFFFFFAEh
R5 = 35h = 53
AR2 = 8098A2h
Data at 809900h = 0AE000000h
Data at 8098A2h = 35h = 53
LUF LV UF N Z V C = 0 0 0 1 0 0 0


Branch Conditionally (Standard) Bcond 


Syntax Bcond <src>

Operation If cond is true:
If src is in register addressing mode (Rn, 0 ≤ n ≤ 27),
src → PC.
If src is in PC-relative mode (label or address),
displacement + PC + 1 → PC.
Else, continue.

Operands src conditional-branch addressing modes (B):
0 register
1 PC-relative

Encoding
31 24 23 16 15 8 7 0
[Encoding diagram: cond field, followed by register or displacement]

Description Bcond signifies a standard branch that executes in four cycles. A branch
is performed if the condition is true. If the src operand is expressed in
register addressing mode, the contents of the specified register are loaded into
the PC. If the src operand is expressed in PC-relative mode, the assembler
generates a displacement: displacement = label - (PC of branch instruction
+ 1). This displacement is stored as a 16-bit signed integer in the 16
least-significant bits of the branch instruction word. This displacement is added
to the PC of the branch instruction plus 1 to generate the new PC.

The TMS320C30 provides 20 condition codes that can be used with this
instruction (see Section 11.2 for a list of condition mnemonics, encoding,
and flags).

Cycles 4

Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.

Example BZ R0

Before Instruction:
PC = 2B00h
R0 = 0003FF00h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:
PC = 3FF00h
R0 = 0003FF00h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

BcondD


Branch Conditionally (Delayed)

Syntax BcondD <src>

Operation If cond is true:
If src is in register addressing mode (Rn, 0 ≤ n ≤ 27),
src → PC.
If src is in PC-relative mode (label or address),
displacement + PC + 3 → PC.
Else, continue.

Operands src conditional-branch addressing modes (B):
0 register
1 PC-relative

Encoding
31 24 23 16 15 8 7 0
[Encoding diagram: cond field, followed by register or displacement]

Description BcondD signifies a delayed branch that allows the three instructions after
the delayed branch to be fetched before the PC is modified. The effect is a
single-cycle branch.

A branch is performed if the condition is true. If the src operand is
expressed in register addressing mode, the contents of the specified register
are loaded into the PC. If the src operand is expressed in PC-relative mode,
the assembler generates a displacement: displacement = label - (PC of
branch instruction + 3). This displacement is stored as a 16-bit signed
integer in the 16 least-significant bits of the branch instruction. This
displacement is added to the PC of the branch instruction plus 3 to generate
the new PC. The TMS320C30 provides 20 condition codes that can be
used with this instruction (see Section 11.2 for a list of condition
mnemonics, encoding, and flags).

Cycles 1

Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.

Example BNZD 36 (36 = 24h)

Before Instruction:
PC = 50h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:
PC = 77h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
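
The PC-relative displacement arithmetic for standard and delayed branches can be sketched in Python, used here only as a modeling language. The function names encode_displacement and branch_target are hypothetical, and the 24-bit wrap of the program counter is an assumption of this sketch.

```python
def encode_displacement(target, branch_pc, delayed=False):
    """Compute the 16-bit field an assembler would store for a PC-relative branch."""
    offset = 3 if delayed else 1           # delayed branches use PC + 3
    disp = target - (branch_pc + offset)
    if not -0x8000 <= disp <= 0x7FFF:
        raise ValueError("target is outside the 16-bit signed displacement range")
    return disp & 0xFFFF


def branch_target(disp_field, branch_pc, delayed=False):
    """Recover the new PC from the stored 16-bit signed displacement."""
    disp = disp_field - 0x10000 if disp_field & 0x8000 else disp_field
    offset = 3 if delayed else 1
    return (branch_pc + offset + disp) & 0xFFFFFF   # assumed 24-bit PC
```

With the BNZD example above, branch_target(0x24, 0x50, delayed=True) gives 77h, since 50h + 3 + 24h = 77h.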


Branch Unconditionally (Standard) BR 


Syntax BR <src>

Operation src → PC

Operands src long-immediate addressing mode

Encoding
31 24 23 16 15 8 7 0

Description BR signifies a standard branch that executes in four cycles. An
unconditional branch is performed. The src operand is assumed to be a 24-bit
unsigned integer. Note that bit 24 = 0 for a standard branch.

Cycles 4

Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.

Example BR 805Ch

Before Instruction:
PC = 80h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:
PC = 805Ch
LUF LV UF N Z V C = 0 0 0 0 0 0 0

BRD


Branch Unconditionally (Delayed)

Syntax BRD <src>

Operation src → PC

Operands src long-immediate addressing mode

Encoding
31 24 23 16 15 8 7 0

Description BRD signifies a delayed branch that allows the three instructions after the
delayed branch to be fetched before the PC is modified. The effect is a
single-cycle branch.

An unconditional branch is performed. The src operand is assumed to be a
24-bit unsigned integer. Note that bit 24 = 1 for a delayed branch.

Cycles 4

Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.

Example BRD 2Ch

Before Instruction:
PC = 1Bh
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:
PC = 2Ch
LUF LV UF N Z V C = 0 0 0 0 0 0 0


Call Subroutine CALL 


Syntax CALL <src>

Operation Next PC → *++SP
src → PC

Operands src long-immediate addressing mode

Encoding
31 24 23 16 15 8 7 0

Description A call is performed. The next PC value is pushed onto the system stack. The
src operand is loaded into the PC. The src operand is assumed to be a
24-bit unsigned immediate operand.

Cycles 4

Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.

Example CALL 123456h

Before Instruction:
PC = 5h
SP = 809801h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:
PC = 123456h
SP = 809802h
Data at 809802h = 6h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

CALLcond


Call Subroutine Conditionally

Syntax CALLcond <src>

Operation If cond is true:
Next PC → *++SP
If src is in register addressing mode (Rn, 0 ≤ n ≤ 27),
src → PC.
If src is in PC-relative mode (label or address),
displacement + PC + 1 → PC.
Else, continue.

Operands src conditional-branch addressing modes (B):
0 register
1 PC-relative

Encoding
31 24 23 16 15 8 7 0
[Encoding diagram: cond field, followed by register or displacement]

Description A call is performed if the condition is true. If the condition is true, the next
PC value is pushed onto the system stack. If the src operand is expressed
in register addressing mode, the contents of the specified register are loaded
into the PC. If the src operand is expressed in PC-relative mode, the
assembler generates a displacement: displacement = label - (PC of call
instruction + 1). This displacement is stored as a 16-bit signed integer in the
16 least-significant bits of the call instruction word. This displacement is
added to the PC of the call instruction plus 1 to generate the new PC.

The TMS320C30 provides 20 condition codes that can be used with this
instruction (see Section 11.2 for a list of condition mnemonics, encoding,
and flags).

Cycles 5

Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.
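
The push of the return address (Next PC → *++SP) can be sketched in Python, used here only as a modeling language. The function call_cond and the dictionary-based state and memory are hypothetical; the caller supplies the already-resolved target and the condition result.

```python
def call_cond(state, mem, cond_true, target):
    """Model CALL/CALLcond: push the next PC when taken, else fall through."""
    next_pc = state["PC"] + 1
    if cond_true:
        state["SP"] += 1               # *++SP: pre-increment the stack pointer
        mem[state["SP"]] = next_pc     # return address goes to the new top of stack
        state["PC"] = target
    else:
        state["PC"] = next_pc          # condition false: continue with next instruction
    return state


# CALLNZ R5 with PC = 123h, SP = 809835h, R5 = 789h and Z = 0 (condition true).
state = {"PC": 0x123, "SP": 0x809835}
mem = {}
call_cond(state, mem, cond_true=True, target=0x789)
```

An unconditional CALL corresponds to calling the same function with cond_true=True and the long-immediate operand as the target.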


Example CALLNZ R5

Before Instruction:
PC = 123h
SP = 809835h
R5 = 789h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:
PC = 789h
SP = 809836h
R5 = 789h
Data at 809836h = 124h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

CMPF Compare Floating-Point


Syntax CMPF <src>,<dst>

Operation dst - src

Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 7)

Encoding
31 24 23 16 15 8 7 0

Description The src operand is subtracted from the dst operand. The result is not loaded
into any register, thus allowing for nondestructive compares. The dst and
src operands are assumed to be floating-point numbers.

Cycles 1

Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if a floating-point overflow occurs, 0 otherwise.
C Unaffected.
UF 1 if a floating-point underflow occurs, 0 otherwise.
LV 1 if a floating-point overflow occurs, unchanged otherwise.
LUF 1 if a floating-point underflow occurs, unchanged otherwise.

Mode Bit OVM Operation not affected by OVM.
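
To check the decimal values quoted next to the hexadecimal words in the floating-point examples, a 32-bit memory word can be decoded with the following Python sketch (Python is used only as a modeling language). It assumes the device's single-precision layout of an 8-bit two's-complement exponent, a sign bit, and a 23-bit fraction, which is not restated in this section; the function name decode_short_float is hypothetical.

```python
def decode_short_float(word):
    """Decode a 32-bit word as exponent (bits 31-24), sign (bit 23), fraction (bits 22-0)."""
    exp = (word >> 24) & 0xFF
    if exp >= 0x80:
        exp -= 0x100                     # exponent is two's complement
    if exp == -128:
        return 0.0                       # reserved exponent value encodes zero
    sign = (word >> 23) & 1
    frac = (word & 0x7FFFFF) / float(1 << 23)
    # Implied leading bits: 01.f for positive values, 10.f (that is, -2 + f) for negative.
    mantissa = (-2.0 + frac) if sign else (1.0 + frac)
    return mantissa * 2.0 ** exp


# A compare such as CMPF only looks at the sign and zero-ness of dst - src.
difference = decode_short_float(0x070C8000) - decode_short_float(0x070C8000)
```

Here decode_short_float(0x070C8000) evaluates to 140.5, and the difference of two equal operands is zero, which corresponds to Z = 1 after the compare.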




Example CMPF *+AR4,R6

Before Instruction:
AR4 = 8098F2h
R6 = 070C800000h = 1.4050e+02
Data at 8098F3h = 070C8000h = 1.4050e+02
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:
AR4 = 8098F2h
R6 = 070C800000h = 1.4050e+02
Data at 8098F3h = 070C8000h = 1.4050e+02
LUF LV UF N Z V C = 0 0 0 0 1 0 0

Compare Floating-Point, 3-Operand CMPF3


Syntax CMPF3 <src2>,<src1>

Operation src1 - src2

Operands src1 three-operand addressing modes (T):
00 register (Rn1, 0 ≤ n1 ≤ 7)
01 indirect (disp = 0, 1, IR0, IR1)
10 register (Rn1, 0 ≤ n1 ≤ 7)
11 indirect (disp = 0, 1, IR0, IR1)

src2 three-operand addressing modes (T):
00 register (Rn2, 0 ≤ n2 ≤ 7)
01 register (Rn2, 0 ≤ n2 ≤ 7)
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

Description The src2 operand is subtracted from the src1 operand. The result is not
loaded into any register, thus allowing for nondestructive compares. The
src1 and src2 operands are assumed to be floating-point numbers.
Although this instruction has only two operands, it is designated as a
three-operand instruction since operands are specified in the three-operand
format.

Cycles 1

Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if a floating-point overflow occurs, 0 otherwise.
C Unaffected.
UF 1 if a floating-point underflow occurs, 0 otherwise.
LV 1 if a floating-point overflow occurs, unchanged otherwise.
LUF 1 if a floating-point underflow occurs, unchanged otherwise.

Mode Bit OVM Operation not affected by OVM.

Example CMPF3 *AR2,*AR3--(1)

Before Instruction:
AR2 = 809831h
AR3 = 809852h
Data at 809831h = 77A7000h = 2.5044e+02
Data at 809852h = 57A2000h = 6.253125e+01
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:
AR2 = 809831h
AR3 = 809851h
Data at 809831h = 77A7000h = 2.5044e+02
Data at 809852h = 57A2000h = 6.253125e+01
LUF LV UF N Z V C = 0 0 0 1 0 0 0

CMPI


Compare Integer

Syntax CMPI <src>,<dst>

Operation dst - src

Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 27)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

Description The src operand is subtracted from the dst operand. The result is not loaded
into any register, thus allowing for nondestructive compares. The dst and
src operands are assumed to be signed integers.

Cycles 1

Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if an integer overflow occurs, 0 otherwise.
C 1 if a borrow occurs, 0 otherwise.
UF 0
LV 1 if an integer overflow occurs, unchanged otherwise.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.

Example CMPI R3,R7

Before Instruction:
R3 = 898h = 2200
R7 = 3E8h = 1000
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:
R3 = 898h = 2200
R7 = 3E8h = 1000
LUF LV UF N Z V C = 0 0 0 1 0 0 0


Compare Integer, 3-Operand CMPI3 


Syntax CMPI3 <src2>,<src1>

Operation src1 - src2

Operands src1 three-operand addressing modes (T):
00 register (Rn1, 0 ≤ n1 ≤ 27)
01 indirect (disp = 0, 1, IR0, IR1)
10 register (Rn1, 0 ≤ n1 ≤ 27)
11 indirect (disp = 0, 1, IR0, IR1)

src2 three-operand addressing modes (T):
00 register (Rn2, 0 ≤ n2 ≤ 27)
01 register (Rn2, 0 ≤ n2 ≤ 27)
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

Description The src2 operand is subtracted from the src1 operand. The result is not
loaded into any register, thus allowing for nondestructive compares. The
src1 and src2 operands are assumed to be signed integers. Although this
instruction has only two operands, it is designated as a three-operand
instruction since operands are specified in the three-operand format.

Cycles 1

Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if an integer overflow occurs, 0 otherwise.
C 1 if a borrow occurs, 0 otherwise.
UF 0
LV 1 if an integer overflow occurs, unchanged otherwise.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.

Example CMPI3 R7,R4

Before Instruction:
R7 = 03E8h = 1000
R4 = 0898h = 2200
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:
R7 = 03E8h = 1000
R4 = 0898h = 2200
LUF LV UF N Z V C = 0 0 0 0 0 0 0

DBcond


Decrement and Branch Conditionally (Standard)

Syntax DBcond <ARn>,<src>

Operation ARn - 1 → ARn
If cond is true and ARn ≥ 0:
If src is in register addressing mode (Rn, 0 ≤ n ≤ 27),
src → PC.
If src is in PC-relative mode (label or address),
displacement + PC + 1 → PC.
Else, continue.

Operands src conditional-branch addressing modes (B):
0 register
1 PC-relative

ARn register (0 ≤ n ≤ 7)

Encoding
31 24 23 16 15 8 7 0
[Encoding diagram: ARn field, cond field, register or displacement]

Description DBcond signifies a standard branch that executes in four cycles. The
specified auxiliary register is decremented and a branch is performed if the
condition is true and the specified auxiliary register is greater than or equal
to zero.

The auxiliary register is treated as a 24-bit signed integer. The
most-significant eight bits are unmodified by the decrement operation. The
comparison of the auxiliary register uses only the 24 least-significant bits of the
auxiliary register. Note that the branch condition does not depend on the
auxiliary register decrement.

If the src operand is expressed in register addressing mode, the contents of
the specified register are loaded into the PC. If the src operand is expressed
in PC-relative addressing mode, the assembler generates a displacement:
displacement = label - (PC of branch instruction + 1). This displacement is
stored as a 16-bit signed integer in the 16 least-significant bits of the branch
instruction word. This displacement is added to the PC of the branch
instruction plus 1 to generate the new PC.

The TMS320C30 provides 20 condition codes that can be used with this
instruction (see Section 11.2 for a list of condition mnemonics, encoding,
and flags).

Cycles 4

Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.
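
The decrement-and-test step can be sketched in Python, used here only as a modeling language. The function dbcond is hypothetical. Following the order given under Operation, the sketch tests the decremented value; the condition result is supplied by the caller because it comes from the status flags, not from the decrement.

```python
def dbcond(ar, cond_true):
    """Return (new_ar, branch_taken) for one DBcond step on a 32-bit ARn value."""
    low24 = (ar - 1) & 0xFFFFFF                 # decrement only the 24 low bits
    new_ar = (ar & 0xFF000000) | low24          # upper eight bits stay unmodified
    signed = low24 - 0x1000000 if low24 & 0x800000 else low24
    return new_ar, bool(cond_true and signed >= 0)


def count_iterations(initial_ar):
    """Count loop passes when the condition is always true (loop ends when ARn goes negative)."""
    passes, ar = 0, initial_ar
    while True:
        passes += 1                             # loop body runs once per pass
        ar, taken = dbcond(ar, True)
        if not taken:
            return passes
```

Under these assumptions a counter loaded with N gives N + 1 passes through the loop body, for example count_iterations(3) returns 4.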


Example DBLT AR3,R2

Before Instruction:
PC = 5Fh
AR3 = 12h
R2 = 9Fh
LUF LV UF N Z V C = 0 0 0 1 0 0 0

After Instruction:
PC = 9Fh
AR3 = 11h
R2 = 9Fh
LUF LV UF N Z V C = 0 0 0 1 0 0 0


DBcondD 


Decrement and Branch Conditionally (Delayed)

Syntax DBcondD <ARn>,<src>

Operation ARn - 1 → ARn
If cond is true and ARn ≥ 0:
If src is in register addressing mode (Rn, 0 ≤ n ≤ 27),
src → PC.
If src is in PC-relative mode (label or address),
displacement + PC + 3 → PC.
Else, continue.

Operands src conditional-branch addressing modes (B):
0 register
1 PC-relative

ARn register (0 ≤ n ≤ 7)

Encoding
31 24 23 16 15 8 7 0
[Encoding diagram: ARn field, cond field, register or displacement]

Description DBcondD signifies a delayed branch that allows the three instructions after
the delayed branch to be fetched before the PC is modified. The effect is a
single-cycle branch. The specified auxiliary register is decremented and a
branch is performed if the condition is true and the specified auxiliary
register is greater than or equal to zero.

The auxiliary register is treated as a 24-bit signed integer. The
most-significant eight bits are unmodified by the decrement operation. The
comparison of the auxiliary register uses only the 24 least-significant bits of the
auxiliary register. Note that the branch condition does not depend on the
auxiliary register decrement.

If the src operand is expressed in register addressing mode, the contents of
the specified register are loaded into the PC. If the src operand is expressed in
PC-relative addressing mode, the assembler generates a displacement:
displacement = label - (PC of branch instruction + 3). This displacement is added
to the PC of the branch instruction plus 3 to generate the new PC. Note
that bit 21 = 1 for a delayed branch.

The TMS320C30 provides 20 condition codes that can be used with this
instruction (see Section 11.2 for a list of condition mnemonics, encoding,
and flags).

Cycles 1

Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.


Example DBZD AR5,$+110h

Before Instruction:
PC = 0h
AR5 = 67h
LUF LV UF N Z V C = 0 0 0 0 1 0 0

After Instruction:
PC = 110h
AR5 = 66h
LUF LV UF N Z V C = 0 0 0 0 1 0 0

FIX Floating-Point to Integer Conversion


Syntax FIX <src>,<dst>

Operation fix(src) → dst

Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

Description The floating-point operand src is converted to the nearest integer less than
or equal to it in absolute value, and the result is loaded into the dst register.
The src operand is assumed to be a floating-point number and the dst
operand a signed integer.

The exponent field of the result register (if it has one) is not modified.

Integer overflow occurs when the floating-point number is too large to be
represented as a 32-bit two's-complement integer. In the case of integer
overflow, the result will be saturated in the direction of overflow.

Cycles 1

Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if an integer overflow occurs, 0 otherwise.
C Unaffected.
UF 0
LV 1 if an integer overflow occurs, unchanged otherwise.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.
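
The conversion and its saturation on overflow can be sketched in Python, used here only as a modeling language. The function fix is hypothetical and works on an ordinary Python float. The sketch reads "less than or equal to it in absolute value" as rounding toward zero, which is an assumption about the wording above and only matters for negative non-integer inputs.

```python
import math

INT32_MAX = 0x7FFFFFFF
INT32_MIN = -0x80000000


def fix(value):
    """Return (integer_result, overflow) with saturation in the direction of overflow."""
    result = math.trunc(value)         # assumed reading: drop the fractional part
    if result > INT32_MAX:
        return INT32_MAX, True         # too large: saturate to most positive value
    if result < INT32_MIN:
        return INT32_MIN, True         # too negative: saturate to most negative value
    return result, False
```

For the operand value 1345.4 the sketch returns 1345 (541h) with no overflow, and an out-of-range input such as 1e12 returns 7FFFFFFFh with the overflow indication set.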


Example FIX R1,R2

Before Instruction:
R1 = 0A282CCCCCh = 1.3454e+3
R2 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:
R1 = 0A282CCCCCh = 1.3454e+3
R2 = 541h = 1345
LUF LV UF N Z V C = 0 0 0 0 0 0 0


Parallel FIX and STI FIX||STI 


Syntax FIX <src2>,<dst1>
|| STI <src3>,<dst2>

Operation fix(src2) → dst1
|| src3 → dst2

Operands src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn1, 0 ≤ n1 ≤ 7)
src3 register (Rn2, 0 ≤ n2 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

Description A floating-point to integer conversion is performed. All registers are read
at the beginning and loaded at the end of the execute cycle. This means
that if one of the parallel operations (STI) reads from a register, and the
operation being performed in parallel (FIX) writes to the same register, then
STI accepts as input the contents of the register before it is modified by FIX.

If src2 and dst2 point to the same location, src2 is read before the write to
dst2.

Integer overflow occurs when the floating-point number is too large to be
represented as a 32-bit two's-complement integer. In the case of integer
overflow, the result will be saturated in the direction of overflow.

Cycles 1

Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if an integer overflow occurs, 0 otherwise.
C Unaffected.
UF 0
LV 1 if an integer overflow occurs, unchanged otherwise.
LUF Unaffected.

Mode Bit OVM Operation affected by OVM.


Example FIX *++AR4(1),R1
|| STI R0,*AR2

Before Instruction:
AR4 = 8098A2h
R1 = 0h
R0 = 0DCh = 220
AR2 = 80983Ch
Data at 8098A3h = 733C000h = 1.7975e+02
Data at 80983Ch = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:
AR4 = 8098A3h
R1 = 0B3h = 179
R0 = 0DCh = 220
AR2 = 80983Ch
Data at 8098A3h = 733C000h = 1.7975e+02
Data at 80983Ch = 0DCh = 220
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Integer to Floating-Point Conversion FLOAT


Syntax FLOAT <src>,<dst> 

Operation float(src) > dst 

Operands sre general addressing modes (G): 
00 register (Rn, O <n < 27) 
01 direct 
10. indirect 


11 immediate 
dst register (Rn,O <n < 7) 
Encoding 
31 24 23 1615 87 0 


povorsa] ae Pe 


Description — The integer operand src is converted to the floating-point value equal to it, 
and the result loaded into the dst register. The src operand is assumed to 
be a signed integer, and the dst operand a floating-point number. 
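
A minimal Python sketch of the conversion follows. The 40-bit register layout it uses (8-bit two's-complement exponent in bits 39-32, sign in bit 31, 31-bit fraction with an implied leading bit, and exponent -128 standing for zero) is an assumption inferred from the hexadecimal values in the examples of this section, and the function name is hypothetical.

```python
def float_to_c30_extended(n):
    """Encode a signed 32-bit integer as an assumed 40-bit register word."""
    if not -0x80000000 <= n <= 0x7FFFFFFF:
        raise ValueError("src must fit in a signed 32-bit integer")
    if n == 0:
        return 0x80 << 32  # assumed zero encoding: exponent -128, mantissa 0
    if n > 0:
        e = n.bit_length() - 1            # 2**e <= n < 2**(e + 1)
        sign = 0
        frac = (n - (1 << e)) << (31 - e)
    else:
        e = (-n - 1).bit_length() - 1     # -2**(e + 1) <= n < -2**e
        sign = 1
        frac = (n + (1 << (e + 1))) << (31 - e)
    return ((e & 0xFF) << 32) | (sign << 31) | frac


print(hex(float_to_c30_extended(174)))
```

For the integer 174 used in the FLOAT examples, the sketch yields 072E000000h.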


Cycles 1


Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 0
C Unaffected.
UF 0
LV Unaffected.
LUF Unaffected.


Mode Bit OVM Operation not affected by OVM.


Example FLOAT *++AR2(2),R5


Before Instruction:

AR2 = 809800h
R5 = 034C200000h = 1.27578125e+01
Data at 809802h = 0AEh = 174
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

AR2 = 809802h
R5 = 072E000000h = 1.74e+02
Data at 809802h = 0AEh = 174
LUF LV UF N Z V C = 0 0 0 0 0 0 0


Parallel FLOAT and STF FLOAT||STF 


Syntax FLOAT <src2>,<dst1>
|| STF <src3>,<dst2>
Operation float(src2) → dst1
|| src3 → dst2
Operands src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn1, 0 ≤ n1 ≤ 7)
src3 register (Rn2, 0 ≤ n2 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)


Encoding
31 24 23 16 15 8 7 0


Description An integer to floating-point conversion is performed. All registers are read
at the beginning and loaded at the end of the execute cycle. This means that
if one of the parallel operations (STF) reads from a register and the operation
being performed in parallel (FLOAT) writes to the same register, then
STF accepts as input the contents of the register before it is modified by
FLOAT.

If src2 and dst2 point to the same location, src2 is read before the write to
dst2.

Cycles 1
Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 0
C Unaffected.
UF 0
LV Unaffected.
LUF Unaffected.


Mode Bit OVM Operation not affected by OVM.


Example FLOAT *+AR2(IR0),R6
|| STF R7,*AR1


Before Instruction:

AR2 = 8098C5h
IR0 = 8h
R6 = 0h
R7 = 034C200000h = 1.27578125e+01
AR1 = 809933h
Data at 8098CDh = 0AEh = 174
Data at 809933h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

AR2 = 8098C5h
IR0 = 8h
R6 = 072E000000h = 1.740e+02
R7 = 034C200000h = 1.27578125e+01
AR1 = 809933h
Data at 8098CDh = 0AEh = 174
Data at 809933h = 034C2000h = 1.27578125e+01
LUF LV UF N Z V C = 0 0 0 0 0 0 0


Interrupt Acknowledge IACK 


Syntax IACK <src>
Operation Perform a dummy read operation with IACK = 0.
At end of dummy read, set IACK to 1.

Operands src general addressing modes (G):
01 direct
10 indirect

Encoding
31 24 23 16 15 8 7 0


Description A dummy read operation is performed with IACK = 0. At the end of the
dummy read, IACK is set to 1. This instruction can be used to generate an
external interrupt acknowledge. If the address specified is off-chip, a read
operation from that address is performed. The IACK signal and the address
can then be used to signal interrupt acknowledge to external devices. The
data read by the processor is unused.


Cycles 1


Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.


Mode Bit OVM Operation not affected by OVM.


Example IACK *AR5


Before Instruction:

IACK = 1
PC = 300h
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

IACK = 1
PC = 301h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
IDLE Idle Until Interrupt


Syntax IDLE

Operation 1 → ST(GIE)
Next PC → PC
Idle until interrupt.

Operands None

Encoding
31 24 23 16 15 8 7 0
0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


Description The global interrupt enable bit is set, the next PC value is loaded into the
PC, and the CPU idles until an interrupt is received. When the interrupt is
received, the contents of the PC are pushed on the active system stack.


Cycles 1

Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.


Mode Bit OVM Operation not affected by OVM.


Load Floating-Point Exponent LDE 


Syntax LDE <src>,<dst>

Operation src(exp) → dst(exp)

Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate
dst register (Rn, 0 ≤ n ≤ 7)


Encoding
31 24 23 16 15 8 7 0


Description The exponent field of the src operand is loaded into the exponent field of 
the dst register. No modification of the dst register mantissa field is made 
unless the value of the exponent loaded is the reserved value of the expo- 
nent for zero in the precision of the src operand. Then the mantissa field 
of the dst register is set to zero. The src and dst operands are assumed to 
be floating-point numbers. 
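
The exponent replacement can be sketched on register images in Python. The sketch assumes a 40-bit register with the exponent in bits 39-32 and the mantissa in bits 31-0, and assumes that 80h (-128) is the reserved exponent for zero; both are inferences from this section, and the function name `lde` is only illustrative.

```python
EXP_MASK = 0xFF00000000  # assumed exponent field, bits 39-32
MAN_MASK = 0x00FFFFFFFF  # assumed mantissa field, bits 31-0
ZERO_EXP = 0x80          # assumed reserved exponent for zero (-128)


def lde(src, dst):
    """Return the new dst register image after LDE src,dst."""
    exp = src & EXP_MASK
    if (exp >> 32) == ZERO_EXP:
        return exp  # mantissa is set to zero as well
    return exp | (dst & MAN_MASK)


print(f"{lde(0x0200056F30, 0x0A056FE332):010X}")
```

Applied to the register values in the LDE example, the sketch keeps the mantissa of R5 and replaces only its exponent byte.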


Cycles 1
Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.
Mode Bit OVM Operation not affected by OVM.
Example LDE R0,R5


Before Instruction:

R0 = 0200056F30h = 4.00066337e+00
R5 = 0A056FE332h = 1.06749648e+03
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

R0 = 0200056F30h = 4.00066337e+00
R5 = 02056FE332h = 4.16990814e+00
LUF LV UF N Z V C = 0 0 0 0 0 0 0
LDF Load Floating-Point


Syntax LDF <src>,<dst>
Operation src → dst


Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 7)


Encoding
31 24 23 16 15 8 7 0


Description The src operand is loaded into the dst register. The dst and src operands
are assumed to be floating-point numbers.


Cycles 1
Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 0
C Unaffected.
UF 0
LV Unaffected.
LUF Unaffected.


Mode Bit OVM Operation not affected by OVM.


Example LDF @9800h,R2


Before Instruction:

DP = 80h
R2 = 0h
Data at 809800h = 10C52A0h = 2.19254303e+00
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

DP = 80h
R2 = 010C52A000h = 2.19254303e+00
Data at 809800h = 10C52A0h = 2.19254303e+00
LUF LV UF N Z V C = 0 0 0 0 0 0 0


Load Floating-Point Conditionally LDFcond 


Syntax LDFcond <src>,<dst>


Operation If cond is true:
src → dst.
Else:
dst is unchanged.


Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 7)


Encoding
31 24 23 16 15 8 7 0


Description If the condition is true, the src operand is loaded into the dst register.
Otherwise, the dst register is unchanged. The dst and src operands are
assumed to be floating-point numbers.

The TMS320C30 provides 20 condition codes that can be used with this
instruction (see Section 11.2 for a list of condition mnemonics, encoding,
and flags). Note that an LDFU (load floating-point unconditionally)
instruction is useful for loading R0-R7 without affecting condition flags.


Cycles 1
Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.


Mode Bit OVM Operation not affected by OVM.


Example LDFZ R3,R5


Before Instruction:

R3 = 2CFF2CD500h = 1.77055560e+13
R5 = 5F0000003Eh = 3.96140824e+28
LUF LV UF N Z V C = 0 0 0 0 1 0 0


After Instruction:

R3 = 2CFF2CD500h = 1.77055560e+13
R5 = 2CFF2CD500h = 1.77055560e+13
LUF LV UF N Z V C = 0 0 0 0 1 0 0
LDFI Load Floating-Point, Interlocked


Syntax LDFI <src>,<dst>
Operation Signal interlocked operation.
src → dst
Operands src general addressing modes (G):
01 direct
10 indirect

dst register (Rn, 0 ≤ n ≤ 7)
Encoding
31 24 23 16 15 8 7 0


Description The src operand is loaded into the dst register. An interlocked operation is
signaled over XF0 and XF1. The src and dst operands are assumed to be
floating-point numbers. Note that only the direct and indirect modes are
allowed. Refer to Section 7.3 for a detailed description.


Cycles 1 if XF1 = 0 (see Section 7.3)
Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 0
C Unaffected.
UF 0
LV Unaffected.
LUF Unaffected.


Mode Bit OVM Operation not affected by OVM.


Example LDFI *+AR2,R7


Before Instruction:

AR2 = 8098F1h
R7 = 0h
Data at 8098F2h = 584C000h = -6.28125e+01
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

AR2 = 8098F1h
R7 = 0584C00000h = -6.28125e+01
Data at 8098F2h = 584C000h = -6.28125e+01
LUF LV UF N Z V C = 0 0 0 1 0 0 0
Parallel LDF and LDF LDF||LDF


Syntax LDF <src2>,<dst2>
|| LDF <src1>,<dst1>
Operation src2 → dst2
|| src1 → dst1


Operands src1 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn1, 0 ≤ n1 ≤ 7)
src2 indirect (disp = 0, 1, IR0, IR1)
dst2 register (Rn2, 0 ≤ n2 ≤ 7)


Encoding
31 24 23 16 15 8 7 0


Description Two floating-point loads are performed in parallel. If the LDFs load the
same register, the assembler issues a warning. The result is that of LDF
<src2>,<dst2>.


Cycles 1

Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.
Mode Bit OVM Operation not affected by OVM.


Example LDF *--AR1(IR0),R7
|| LDF *AR7++(1),R3


Before Instruction:

AR1 = 80985Fh
IR0 = 8h
R7 = 0h
AR7 = 80988Ah
R3 = 0h
Data at 809857h = 70C8000h = 1.4050e+02
Data at 80988Ah = 57B4000h = 6.281250e+01
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

AR1 = 809857h
IR0 = 8h
R7 = 070C800000h = 1.4050e+02
AR7 = 80988Bh
R3 = 057B400000h = 6.281250e+01
Data at 809857h = 70C8000h = 1.4050e+02
Data at 80988Ah = 57B4000h = 6.281250e+01
LUF LV UF N Z V C = 0 0 0 0 0 0 0
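
The indirect operand forms that appear in the examples of this section differ in when the displacement is applied and whether ARn is updated. The following Python sketch summarizes the behavior as it can be read off those examples; the function name, the mode strings used as keys, and the 24-bit address mask are assumptions made for illustration.

```python
ADDRESS_MASK = 0xFFFFFF  # assumption: 24-bit addresses, as in the examples


def indirect(mode, ar, disp):
    """Return (effective address, updated ARn) for one indirect operand."""
    added = (ar + disp) & ADDRESS_MASK
    subtracted = (ar - disp) & ADDRESS_MASK
    table = {
        "*+ARn(disp)": (added, ar),                  # pre-add, ARn unchanged
        "*-ARn(disp)": (subtracted, ar),             # pre-subtract, ARn unchanged
        "*++ARn(disp)": (added, added),              # pre-add, ARn modified
        "*--ARn(disp)": (subtracted, subtracted),    # pre-subtract, ARn modified
        "*ARn++(disp)": (ar, added),                 # post-add, ARn modified
        "*ARn--(disp)": (ar, subtracted),            # post-subtract, ARn modified
    }
    return table[mode]


address, new_ar = indirect("*--ARn(disp)", 0x80985F, 0x8)
print(hex(address), hex(new_ar))
```

The call shown reproduces the first operand of the LDF||LDF example: AR1 = 80985Fh with IR0 = 8h addresses 809857h and leaves AR1 = 809857h.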
LDF||STF Parallel LDF and STF


Syntax LDF <src2>,<dst1>
|| STF <src3>,<dst2>
Operation src2 → dst1
|| src3 → dst2


Operands src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn1, 0 ≤ n1 ≤ 7)
src3 register (Rn2, 0 ≤ n2 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)


Encoding
31 24 23 16 15 8 7 0


Description A floating-point load and a floating-point store are performed in parallel.

If src2 and dst2 point to the same location, src2 is read before the write to
dst2.


Cycles 1
Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.


Mode Bit OVM Operation not affected by OVM.


Example LDF *AR2--(1),R1
|| STF R3,*AR4++(IR1)


Before Instruction:

AR2 = 8098E7h
R1 = 0h
R3 = 057B400000h = 6.28125e+01
AR4 = 809900h
IR1 = 10h
Data at 8098E7h = 70C8000h = 1.4050e+02
Data at 809900h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

AR2 = 8098E6h
R1 = 070C800000h = 1.4050e+02
R3 = 057B400000h = 6.28125e+01
AR4 = 809910h
IR1 = 10h
Data at 8098E7h = 70C8000h = 1.4050e+02
Data at 809900h = 57B4000h = 6.28125e+01
LUF LV UF N Z V C = 0 0 0 0 0 0 0
LDI Load Integer


Syntax LDI <src>,<dst>

Operation src → dst

Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 27)
01 direct
10 indirect
11 immediate
dst register (Rn, 0 ≤ n ≤ 27)
Encoding
31 24 23 16 15 8 7 0


Description The src operand is loaded into the dst register. The dst and src operands
are assumed to be signed integers. An alternate form of LDI, LDP, is used
to load the data page pointer register (DP), or any other register, with the
eight MSBs of a relocatable address. See Section 11.3.2.
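
The relationship between DP and a direct operand can be sketched in Python. The address formation shown (the 8 LSBs of DP concatenated with the 16-bit operand) is inferred from the direct-addressing examples in this section, where DP = 80h and @9800h address location 809800h; the function names are hypothetical.

```python
def direct_address(dp, operand):
    """Form a 24-bit address from the data page and a 16-bit direct operand."""
    return ((dp & 0xFF) << 16) | (operand & 0xFFFF)


def ldp_value(address):
    """Return the eight MSBs of a 24-bit address, as LDP would load them."""
    return (address >> 16) & 0xFF


print(hex(direct_address(0x80, 0x9800)), hex(ldp_value(0x80985F)))
```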


Cycles 1
Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 0
C Unaffected.
UF 0
LV Unaffected.
LUF Unaffected.


Mode Bit OVM Operation not affected by OVM.


Example LDI *-AR1(IR0),R5


Before Instruction:

AR1 = 2Ch
IR0 = 5h
R5 = 3C5h = 965
Data at 27h = 26h = 38
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

AR1 = 2Ch
IR0 = 5h
R5 = 26h = 38
Data at 27h = 26h = 38
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Load Integer Conditionally LDIcond


Syntax LDIcond <src>,<dst>


Operation If cond is true:
src → dst.
Else:
dst is unchanged.


Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 27)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 27)


Encoding
31 24 23 16 15 8 7 0


Description If the condition is true, the src operand is loaded into the dst register.
Otherwise, the dst register is unchanged. The dst and src operands are
assumed to be signed integers.

The TMS320C30 provides 20 condition codes that can be used with this
instruction (see Section 11.2 for a list of condition mnemonics, encoding,
and flags). Note that an LDIU (load integer unconditionally) instruction is
useful for loading R0-R7 without affecting the condition flags.


Cycles 1


Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.


Mode Bit OVM Operation not affected by OVM.


Example LDIZ R4,R6


Before Instruction:

R4 = 027Ch = 636
R6 = 0FE2h = 4,066
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

R4 = 027Ch = 636
R6 = 0FE2h = 4,066
LUF LV UF N Z V C = 0 0 0 0 0 0 0
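
The conditional load reduces to a select on the condition. The Python sketch below is illustrative only: it models just the Z condition, assuming that LDIZ loads when the Z flag is 1 (the condition table is in Section 11.2, which is not reproduced here), and the function names and the flag dictionary are hypothetical.

```python
def load_conditionally(cond_true, src, dst):
    """Return the new dst value; status flags are never modified."""
    return src if cond_true else dst


def ldiz(flags, src, dst):
    """LDIZ modelled as: load only when the Z flag is set (assumption)."""
    return load_conditionally(flags["Z"] == 1, src, dst)


print(hex(ldiz({"Z": 0}, 0x27C, 0xFE2)))
```

With Z = 0, as in the LDIZ example, the destination keeps its previous value 0FE2h.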
LDII Load Integer, Interlocked


Syntax LDII <src>,<dst>
Operation Signal interlocked operation.
src → dst
Operands src general addressing modes (G):
01 direct
10 indirect

dst register (Rn, 0 ≤ n ≤ 27)
Encoding
31 24 23 16 15 8 7 0


Description The src operand is loaded into the dst register. An interlocked operation is
signaled over XF0 and XF1. The src and dst operands are assumed to be
signed integers. Note that only the direct and indirect modes are allowed.
Refer to Section 7.3 for a detailed description.


Cycles 1 if XF1 = 0 (see Section 7.3)
Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 0
C Unaffected.
UF 0
LV Unaffected.
LUF Unaffected.


Mode Bit OVM Operation not affected by OVM.


Example LDII @985Fh,R3


Before Instruction:

DP = 80h
R3 = 0h
Data at 80985Fh = 0DCh
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

DP = 80h
R3 = 0DCh
Data at 80985Fh = 0DCh
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Parallel LDI and LDI LDI||LDI


Syntax LDI <src2>,<dst2>
|| LDI <src1>,<dst1>
Operation src2 → dst2
|| src1 → dst1


Operands src1 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn1, 0 ≤ n1 ≤ 7)
src2 indirect (disp = 0, 1, IR0, IR1)
dst2 register (Rn2, 0 ≤ n2 ≤ 7)


Encoding
31 24 23 16 15 8 7 0


Description Two integer loads are performed in parallel. A warning is issued by the
assembler if the LDIs load the same register. The result is that of LDI
<src2>,<dst2>.


Cycles 1
Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.
Mode Bit OVM Operation not affected by OVM.


Example LDI *-AR1(1),R7
|| LDI *AR7++(IR0),R1


Before Instruction:

AR1 = 809826h
R7 = 0h
AR7 = 8098C8h
IR0 = 10h
R1 = 0h
Data at 809825h = 0FAh = 250
Data at 8098C8h = 2EEh = 750
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

AR1 = 809826h
R7 = 0FAh = 250
AR7 = 8098D8h
IR0 = 10h
R1 = 02EEh = 750
Data at 809825h = 0FAh = 250
Data at 8098C8h = 2EEh = 750
LUF LV UF N Z V C = 0 0 0 0 0 0 0
LDI||STI Parallel LDI and STI


Syntax LDI <src2>,<dst1>
|| STI <src3>,<dst2>
Operation src2 → dst1
|| src3 → dst2


Operands src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn1, 0 ≤ n1 ≤ 7)
src3 register (Rn2, 0 ≤ n2 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)


Encoding
31 24 23 16 15 8 7 0


Description An integer load and an integer store are performed in parallel.

If src2 and dst2 point to the same location, src2 is read before the write to
dst2.


Cycles 1
Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.


Mode Bit OVM Operation not affected by OVM.


Example LDI *-AR1(1),R2
|| STI R7,*AR5++(IR0)


Before Instruction:

AR1 = 8098E7h
R2 = 0h
R7 = 35h = 53
AR5 = 80982Ch
IR0 = 8h
Data at 8098E6h = 0DCh = 220
Data at 80982Ch = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

AR1 = 8098E7h
R2 = 0DCh = 220
R7 = 35h = 53
AR5 = 809834h
IR0 = 8h
Data at 8098E6h = 0DCh = 220
Data at 80982Ch = 35h = 53
LUF LV UF N Z V C = 0 0 0 0 0 0 0
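
The read-before-write ordering of the parallel load and store can be modelled with a dictionary standing in for memory. This Python sketch is an illustration under that assumption; `ldi_sti` is a hypothetical name and address modification of the auxiliary registers is left out.

```python
def ldi_sti(memory, src2_address, src3_value, dst2_address):
    """Model LDI||STI: return the value loaded into dst1; memory is updated."""
    loaded = memory[src2_address]      # src2 is read first
    memory[dst2_address] = src3_value  # the store to dst2 happens afterwards
    return loaded


memory = {0x8098E6: 0xDC, 0x80982C: 0x0}
r2 = ldi_sti(memory, 0x8098E6, 0x35, 0x80982C)
print(hex(r2), hex(memory[0x80982C]))
```

When both operands name the same location, the load still returns the old contents and the location then holds the stored value.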
LDM Load Floating-Point Mantissa


Syntax LDM <src>,<dst>

Operation src(man) → dst(man)

Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate
dst register (Rn, 0 ≤ n ≤ 7)
Encoding
31 24 23 16 15 8 7 0


Description The mantissa field of the src operand is loaded into the mantissa field of the
dst register. The dst exponent field is not modified. The src and dst
operands are assumed to be floating-point numbers. If immediate addressing
mode is used, bits 15-12 of the instruction word are forced to 0 by the
assembler.
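
A Python sketch of the mantissa load on register images follows. It assumes the 40-bit layout suggested by the values in this section (exponent in bits 39-32, mantissa in bits 31-0) and a src operand already widened to that layout; the function name `ldm` is only illustrative.

```python
EXP_MASK = 0xFF00000000  # assumed exponent field, bits 39-32
MAN_MASK = 0x00FFFFFFFF  # assumed mantissa field, bits 31-0


def ldm(src, dst):
    """Return the new dst register image after LDM src,dst."""
    return (dst & EXP_MASK) | (src & MAN_MASK)


print(f"{ldm(0x071CC00000, 0x0000000000):010X}")
```

With src = 071CC00000h (156.75) and dst = 0h, the result is 001CC00000h, matching the LDM example.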
Cycles 1
Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.


Mode Bit OVM Operation not affected by OVM.
Example LDM 156.75,R2 (156.75 = 071CC00000h)


Before Instruction:

R2 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

R2 = 001CC00000h = 1.22460938e+00
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Logical Shift LSH


Syntax LSH <count>,<dst>


Operation If count ≥ 0:
dst << count → dst
Else:
dst >> |count| → dst


Operands count general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 27)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 27)


Encoding
31 24 23 16 15 8 7 0


Description The seven least-significant bits of the count operand are used to generate
the two's-complement shift count. If the count operand is greater than zero,
the dst operand is left-shifted by the value of the count operand. Low-order
bits shifted in are zero-filled, and high-order bits are shifted out
through the C (carry) bit.

Logical left-shift:
C ← dst ← 0

If the count operand is less than zero, the dst is right-shifted by the absolute
value of the count operand. The high-order bits of the dst operand are
zero-filled as shifted to the right. Low-order bits are shifted out through the
C (carry) bit.

Logical right-shift:
0 → dst → C

If the count operand is 0, no shift is performed and the C (carry) bit is set
to 0. The count operand is assumed to be a signed integer and the dst
operand is assumed to be an unsigned integer.


Cycles 1

Status Bits N MSB of the output.
Z 1 if a zero output is generated, 0 otherwise.
V 0
C Set to the value of the last bit shifted out. 0 for a shift count of 0.
Unaffected if dst is not R0-R7.
UF 0
LV Unaffected.
LUF Unaffected.


Mode Bit OVM Operation not affected by OVM.
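
The shift rules of LSH can be condensed into a short Python model. It is a sketch rather than a definitive implementation: it assumes a 32-bit integer operand, returns only the result and the C bit, and the function name `lsh` is hypothetical.

```python
WORD_MASK = 0xFFFFFFFF


def lsh(count, dst):
    """Return (result, C) for a logical shift of a 32-bit value."""
    shift = count & 0x7F          # seven least-significant bits of count
    if shift >= 64:
        shift -= 128              # interpret as two's complement: -64..63
    dst &= WORD_MASK
    if shift > 0:
        wide = dst << shift
        return wide & WORD_MASK, (wide >> 32) & 1   # last bit shifted out
    if shift < 0:
        n = -shift
        return dst >> n, (dst >> (n - 1)) & 1       # last bit shifted out
    return dst, 0                                   # count of 0 clears C


print([hex(v) for v in lsh(24, 0x2AC)], [hex(v) for v in lsh(-12, 0x12C00000)])
```

A count of 24 applied to 02ACh gives 0AC000000h, and a count of -12 applied to 12C00000h gives 12C00h, in both cases with C = 0.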
Example LSH R4,R7


Before Instruction:

R4 = 018h = 24
R7 = 02ACh
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

R4 = 018h = 24
R7 = 0AC000000h
LUF LV UF N Z V C = 0 0 0 1 0 0 0


Example LSH *-AR5(IR1),R5


Before Instruction:

AR5 = 809908h
IR1 = 4h
R5 = 0012C00000h
Data at 809904h = 0FFFFFFF4h = -12
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

AR5 = 809908h
IR1 = 4h
R5 = 0000012C00h
Data at 809904h = 0FFFFFFF4h = -12
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Logical Shift, 3-Operand LSH3


Syntax LSH3 <count>,<src>,<dst>


Operation If count ≥ 0:
src << count → dst
Else:
src >> |count| → dst


Operands src three-operand addressing modes (T):
00 register (Rn1, 0 ≤ n1 ≤ 27)
01 indirect (disp = 0, 1, IR0, IR1)
10 register (Rn1, 0 ≤ n1 ≤ 27)
11 indirect (disp = 0, 1, IR0, IR1)

count three-operand addressing modes (T):
00 register (Rn2, 0 ≤ n2 ≤ 27)
01 register (Rn2, 0 ≤ n2 ≤ 27)
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)

dst register (Rn, 0 ≤ n ≤ 27)


Encoding
31 24 23 16 15 8 7 0


Description The seven least-significant bits of the count operand are used to generate
the two's-complement shift count.

If the count operand is greater than zero, the src operand is left-shifted by
the value of the count operand. Low-order bits shifted in are zero-filled,
and high-order bits are shifted out through the C (carry) bit.

Logical left-shift:
C ← src ← 0

If the count operand is less than zero, the src operand is right-shifted by the
absolute value of the count operand. The high-order bits of the dst operand
are zero-filled as shifted to the right. Low-order bits are shifted out through
the C (carry) bit.

Logical right-shift:
0 → src → C

If the count operand is 0, no shift is performed and the C (carry) bit is set
to 0. The count operand is assumed to be a signed integer. The src and
dst operands are assumed to be unsigned integers.


Cycles 1


Status Bits N MSB of the output.
Z 1 if a zero output is generated, 0 otherwise.
V 0
C Set to the value of the last bit shifted out. 0 for a shift count of 0.
Unaffected if dst is not R0-R7.
UF 0
LV Unaffected.
LUF Unaffected.


Mode Bit OVM Operation not affected by OVM.


Example LSH3 R4,R7,R2


Before Instruction:

R4 = 018h = 24
R7 = 02ACh
R2 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

R4 = 018h = 24
R7 = 02ACh
R2 = 0AC000000h
LUF LV UF N Z V C = 0 0 0 1 0 0 0


Example LSH3 *-AR4(IR1),R5,R3


Before Instruction:

AR4 = 809908h
IR1 = 4h
R5 = 012C00000h
R3 = 0h
Data at 809904h = 0FFFFFFF4h = -12
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

AR4 = 809908h
IR1 = 4h
R5 = 012C00000h
R3 = 0000012C00h
Data at 809904h = 0FFFFFFF4h = -12
LUF LV UF N Z V C = 0 0 0 0 0 0 0


Parallel LSH3 and STI LSH3||STI


Syntax LSH3 <count>,<src2>,<dst1>
|| STI <src3>,<dst2>


Operation If count ≥ 0:
src2 << count → dst1
Else:
src2 >> |count| → dst1
|| src3 → dst2


Operands count register (Rn1, 0 ≤ n1 ≤ 7)
src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn3, 0 ≤ n3 ≤ 7)
src3 register (Rn4, 0 ≤ n4 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)


Encoding
31 24 23 16 15 8 7 0


Description The seven least-significant bits of the count operand are used to generate
the two's-complement shift count.

If the count operand is greater than zero, the src2 operand is left-shifted by
the value of the count operand. Low-order bits shifted in are zero-filled
and high-order bits are shifted out through the C (carry) bit.

Logical left-shift:
C ← src2 ← 0

If the count operand is less than zero, the src2 operand is right-shifted by the
absolute value of the count operand. The high-order bits of the dst1 operand
are zero-filled as shifted to the right. Low-order bits are shifted out through
the C (carry) bit.

Logical right-shift:
0 → src2 → C

If the count operand is 0, no shift is performed and the carry bit is set to 0.

The count operand is assumed to be a 7-bit signed integer and the src2 and
dst1 operands are assumed to be unsigned integers. All registers are read
at the beginning and loaded at the end of the execute cycle. This means
that if one of the parallel operations (STI) reads from a register and the
operation being performed in parallel (LSH3) writes to the same register,
then STI accepts as input the contents of the register before it is modified
by the LSH3.

If src2 and dst2 point to the same location, src2 is read before the write to
dst2.


Cycles 1


Status Bits N MSB of the output.
Z 1 if a zero output is generated, 0 otherwise.
V 0
C Set to the value of the last bit shifted out. 0 for a shift count of 0.
UF 0
LV Unaffected.
LUF Unaffected.


Mode Bit OVM Operation not affected by OVM.


Example LSH3 R2,*++AR3(1),R0
|| STI R4,*-AR5


Before Instruction:

R2 = 18h = 24
AR3 = 8098C2h
R0 = 0h
R4 = 0DCh = 220
AR5 = 8098A3h
Data at 8098C3h = 0ACh
Data at 8098A2h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

R2 = 18h = 24
AR3 = 8098C3h
R0 = 0AC000000h
R4 = 0DCh = 220
AR5 = 8098A3h
Data at 8098C3h = 0ACh
Data at 8098A2h = 0DCh = 220
LUF LV UF N Z V C = 0 0 0 1 0 0 0


Example LSH3 R7,*AR2--(1),R2
|| STI R0,*+AR0(1)


Before Instruction:

R7 = 0FFFFFFF4h = -12
AR2 = 809863h
R2 = 0h
R0 = 12Ch = 300
AR0 = 8098B7h
Data at 809863h = 2C000000h
Data at 8098B8h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

R7 = 0FFFFFFF4h = -12
AR2 = 809862h
R2 = 2C000h
R0 = 12Ch = 300
AR0 = 8098B7h
Data at 809863h = 2C000000h
Data at 8098B8h = 12Ch = 300
LUF LV UF N Z V C = 0 0 0 0 0 0 0
MPYF Multiply Floating-Point


Syntax MPYF <src>,<dst>

Operation dst × src → dst

Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate
dst register (Rn, 0 ≤ n ≤ 7)
Encoding
31 24 23 16 15 8 7 0


Description The product of the dst and src operands is loaded into the dst register. The
src operand is assumed to be a single-precision floating-point number, and
the dst operand is an extended-precision floating-point number.
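
To check register values such as those in the MPYF example by hand, a 40-bit register image can be decoded into an ordinary number. The Python sketch below assumes the layout suggested by the values in this section (8-bit two's-complement exponent in bits 39-32, sign in bit 31, 31-bit fraction with an implied leading bit, exponent -128 meaning zero). The function name is hypothetical, and the multiplication uses Python floats, so it does not model the device's own rounding.

```python
def c30_extended_to_float(word):
    """Decode an assumed 40-bit extended-precision register image."""
    exponent = (word >> 32) & 0xFF
    if exponent >= 0x80:
        exponent -= 0x100                 # two's-complement exponent
    if exponent == -128:
        return 0.0                        # assumed reserved exponent for zero
    sign = (word >> 31) & 1
    fraction = (word & 0x7FFFFFFF) / 2**31
    mantissa = (-2.0 + fraction) if sign else (1.0 + fraction)
    return mantissa * 2.0**exponent


r0 = c30_extended_to_float(0x070C800000)
r2 = c30_extended_to_float(0x034C200000)
print(r0, r2, r0 * r2)
```

The two operands decode to 140.5 and 12.7578125, and their product equals the decoded value of 0A600F2000h.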


Cycles 1
Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if a floating-point overflow occurs, 0 otherwise.
C Unaffected.
UF 1 if a floating-point underflow occurs, 0 otherwise.
LV 1 if a floating-point overflow occurs, unchanged otherwise.
LUF 1 if a floating-point underflow occurs, unchanged otherwise.


Mode Bit OVM Operation not affected by OVM.


Example MPYF R0,R2


Before Instruction:

R0 = 070C800000h = 1.4050e+02
R2 = 034C200000h = 1.27578125e+01
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

R0 = 070C800000h = 1.4050e+02
R2 = 0A600F2000h = 1.79247266e+03
LUF LV UF N Z V C = 0 0 0 0 0 0 0


Multiply Floating-Point, 3-Operand MPYF3


Syntax MPYF3 <src2>,<src1>,<dst>
Operation src1 × src2 → dst


Operands src1 three-operand addressing modes (T):
00 register (Rn1, 0 ≤ n1 ≤ 7)
01 indirect (disp = 0, 1, IR0, IR1)
10 register (Rn1, 0 ≤ n1 ≤ 7)
11 indirect (disp = 0, 1, IR0, IR1)

src2 three-operand addressing modes (T):
00 register (Rn2, 0 ≤ n2 ≤ 7)
01 register (Rn2, 0 ≤ n2 ≤ 7)
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)

dst register (Rn, 0 ≤ n ≤ 7)


Encoding
31 24 23 16 15 8 7 0


Description The product of the src1 and src2 operands is loaded into the dst register.
The src1 and src2 operands are assumed to be single-precision
floating-point numbers, and the dst operand is an extended-precision
floating-point number.


Cycles 1
Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if a floating-point overflow occurs, 0 otherwise.
C Unaffected.
UF 1 if a floating-point underflow occurs, 0 otherwise.
LV 1 if a floating-point overflow occurs, unchanged otherwise.
LUF 1 if a floating-point underflow occurs, unchanged otherwise.


Mode Bit OVM Operation not affected by OVM.
Example MPYF3 R0,R7,R1


Before Instruction:

R0 = 057B400000h = 6.281250e+01
R7 = 0733C00000h = 1.79750e+02
R1 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

R0 = 057B400000h = 6.281250e+01
R7 = 0733C00000h = 1.79750e+02
R1 = 0D306A3000h = 1.12905469e+04
LUF LV UF N Z V C = 0 0 0 0 0 0 0


Example MPYF3 *+AR2(IR0),R7,R2
or
MPYF3 R7,*+AR2(IR0),R2


Before Instruction:

AR2 = 809800h
IR0 = 12Ah
R7 = 057B400000h = 6.281250e+01
R2 = 0h
Data at 80992Ah = 70C8000h = 1.4050e+02
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

AR2 = 809800h
IR0 = 12Ah
R7 = 057B400000h = 6.281250e+01
R2 = 0D09E4A000h = 8.82515625e+03
Data at 80992Ah = 70C8000h = 1.4050e+02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
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Parallel MPYF3 and ADDF3 MPYF3||ADDF3 


Syntax 


Operation 


Operands 


Encoding 


MPYF3 <srcA>,<srcB>,<dst1> 
l| ADDF3 <srceC>,<srcD>,<dst2> 


srcA x srcB > dsti7 
|| sreC + srcD > dst2 


srcA 

srcB | Any two indirect (disp = 0,1,1R0,IR1) 
srcC | Any two register (0 < ARn < 7) 
srcD 


dst7 register (d7): 
0 = RO 


1 = R1 
dst2 register (d2): 
0 = R2 
1=R3 
src7 register (Rn,O <n < 7) 
src2 register (Rn,O <n < 7) 
src3 indirect (disp = O, 71, IRO, IR1) 
src4 indirect (disp = O, 1, IRO, IR1) 
P parallel addressing modes (0 <P < 3) 
OPERATION 
00 sre3 x src4, src? + src2 
01 src3 x sre7, src4 + src2 
10 sre? x sre2, src3 + src4 
11 srce3 x srce7l, src2 + src4 
24 23 1615 87 0 


epee? bet oe [Se 


Description 


A floating-point multiplication and a floating-point addition are performed 
in parallel. All registers are read at the beginning and loaded at the end of 
the execute cycle. This means that if one of the parallel operations (MPYF3) 
reads from a register and the operation being performed in_ parallel 
(ADDF3) writes to the same register, then MPYF3 accepts as input the 
contents of the register before it is modified by the ADDF3. 


Any combination of addressing modes may be coded for the four possible 
source operands as long as the two are coded as indirect and two are reg- 
ister. The assignment of the source operands srcA-srcD to the src1-src4 
fields varies depending on the combination of addressing modes used, and 
the P field is encoded accordingly. The assembler may, when not signif- 
icant, change the order of operands in commutative operations, in order to 
simplify processing. 
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MPYF3||ADDF3 Parallel MPYF3 and ADDF3 


Cycles 
Status Bits 


Mode Bit 


Example 
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If src2 and dst2 point to the same location, src2 is read before the write to 


N 
Z 
V 1 if a floating-point overflow occurs, O otherwise. 
Cc Unaffected. 

UF 1 if a floating-point underflow occurs, 0 otherwise. 

LV 1 if a floating-point overflow occurs, 0 unchanged otherwise. 
LUF 1 if a floating-point underflow occurs, 0 unchanged otherwise. 


OVM Operation not affected by OVM. 


MPYF3 *AR5++(1),*--AR1(IRO) ,RO 
|| ADDF3 R5,R7,R3 


Before Instruction: 


AR5 = 8098C5h 

AR1 = 8098A8h 

IR0 = 4h

R0 = 0h

R5 = 0733C00000h = 1.79750e+02

R7 = 070C800000h = 1.4050e+02

R3 = 0h

Data at 8098C5h = 34C0000h = 1.2750e+01
Data at 8098A4h = 1110000h = 2.265625e+00
LUF LV UF NZVC=0000000


After Instruction: 


AR5 = 8098C6h

AR1 = 8098A4h

IR0 = 4h

R0 = 0467180000h = 2.88867188e+01

R5 = 0733C00000h = 1.79750e+02

R7 = 070C800000h = 1.4050e+02

R3 = 0820200000h = 3.20250e+02

Data at 8098C5h = 34C0000h = 1.2750e+01
Data at 8098A4h = 1110000h = 2.265625e+00
LUF LV UF NZVC=0000000 
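
The P-field OPERATION table for MPYF3||ADDF3 can be read as a simple lookup from P to the fields that feed the multiplier and the adder. The following Python sketch is illustrative only: the names `P_TABLE` and `mpyf3_addf3` are hypothetical, ordinary Python floats stand in for the device's floating-point formats, and the choice of which register is src1 and which is src2 is an assumption (both operations are commutative here). The memory word 1110000h is taken at its decoded value, 2.265625.

```python
# P-field table for MPYF3||ADDF3, transcribed from the OPERATION table.
# Each entry: (multiplier operand fields, adder operand fields).
P_TABLE = {
    0b00: (("src3", "src4"), ("src1", "src2")),
    0b01: (("src3", "src1"), ("src4", "src2")),
    0b10: (("src1", "src2"), ("src3", "src4")),
    0b11: (("src3", "src1"), ("src2", "src4")),
}


def mpyf3_addf3(p, fields):
    """Return (product, total) for the operand fields selected by P."""
    (mul_a, mul_b), (add_a, add_b) = P_TABLE[p]
    return fields[mul_a] * fields[mul_b], fields[add_a] + fields[add_b]


# Example from the text: both multiplier operands are indirect (src3, src4)
# and both adder operands are registers (src1, src2), so P = 00.
fields = {
    "src1": 179.75,    # R5
    "src2": 140.5,     # R7
    "src3": 12.75,     # data at 8098C5h
    "src4": 2.265625,  # data at 8098A4h (1110000h)
}
product, total = mpyf3_addf3(0b00, fields)
print(product, total)
```

The product and sum correspond to the R0 and R3 values shown after the instruction.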


Parallel MPYF3 and STF MPYF3||STF 


Syntax MPYF3 <src2>,<src1>,<dst1>
|| STF <src3>,<dst2>

Operation src1 x src2 → dst1
|| src3 → dst2

Operands src1 register (Rn1, 0 ≤ n1 ≤ 7)
src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn3, 0 ≤ n3 ≤ 7)
src3 register (Rn4, 0 ≤ n4 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

Description A floating-point multiplication and a floating-point store are performed in
parallel. All registers are read at the beginning and loaded at the end of the
execute cycle. This means that if one of the parallel operations (STF)
reads from a register and the operation being performed in parallel (MPYF3)
writes to the same register, then STF accepts as input the contents of
the register before it is modified by the MPYF3.

If src2 and dst2 point to the same location, then src2 is read before the write
to dst2.

Cycles 1

Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if a floating-point overflow occurs, 0 otherwise.
C Unaffected.
UF 1 if a floating-point underflow occurs, 0 otherwise.
LV 1 if a floating-point overflow occurs, unchanged otherwise.
LUF 1 if a floating-point underflow occurs, unchanged otherwise.

Mode Bit OVM Operation not affected by OVM.
Example MPYF3 *-AR2(1),R7,R0
|| STF R3,*AR0--(IR0)

Before Instruction:

AR2 = 80982Bh
R7 = 057B400000h = 6.281250e+01
R0 = 0h
R3 = 086B280000h = 4.7031250e+02
AR0 = 809860h
IR0 = 8h
Data at 80982Ah = 70C8000h = 1.4050e+02
Data at 809860h = 0h
LUF LV UF NZVC=0000000

After Instruction:

AR2 = 80982Bh
R7 = 057B400000h = 6.281250e+01
R0 = 0D09E4A000h = 8.82515625e+03
R3 = 086B280000h = 4.7031250e+02
AR0 = 809858h
IR0 = 8h
Data at 80982Ah = 70C8000h = 1.4050e+02
Data at 809860h = 86B2800h = 4.7031250e+02
LUF LV UF NZVC=0000000 
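
The hexadecimal and decimal values in these examples can be cross-checked with a small decoder. The sketch below is an aid for the reader, not part of the device documentation. It assumes the floating-point layout used throughout the guide: an 8-bit two's-complement exponent, followed by a sign bit and a fraction with an implied bit (24 mantissa bits in a memory word, 32 in a 40-bit register), with an exponent of -128 reserved for zero. The function name `decode_float` is hypothetical.

```python
def decode_float(word, mantissa_bits):
    """Decode an exponent/mantissa word into a Python float.

    word          -- raw value (32-bit memory word or 40-bit register)
    mantissa_bits -- 24 for a memory word, 32 for a 40-bit register
    """
    exp = (word >> mantissa_bits) & 0xFF
    if exp >= 0x80:
        exp -= 0x100                     # two's-complement exponent
    if exp == -128:
        return 0.0                       # exponent -128 represents zero
    frac_bits = mantissa_bits - 1
    sign = (word >> frac_bits) & 1
    frac = (word & ((1 << frac_bits) - 1)) / (1 << frac_bits)
    # The implied bit is the complement of the sign bit:
    # 01.f for positive values, 10.f (two's complement) for negative ones.
    mantissa = (1.0 + frac) if sign == 0 else (-2.0 + frac)
    return mantissa * 2.0 ** exp


print(decode_float(0x070C8000, 24))      # data at 80982Ah
print(decode_float(0x057B400000, 32))    # R7
print(decode_float(0x0D09E4A000, 32))    # R0 after the instruction
```

Decoding R7 and the memory operand and multiplying them gives the value decoded from R0, which is a quick consistency check on the example.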
Parallel MPYF3 and SUBF3 MPYF3||SUBF3


Syntax MPYF3 <srcA>,<srcB>,<dst1>
|| SUBF3 <srcC>,<srcD>,<dst2>

Operation srcA x srcB → dst1
|| srcD - srcC → dst2

Operands srcA, srcB, srcC, srcD:
any two indirect (disp = 0, 1, IR0, IR1)
any two register (Rn, 0 ≤ n ≤ 7)
dst1 register (d1):
0 = R0
1 = R1
dst2 register (d2):
0 = R2
1 = R3
src1 register (Rn, 0 ≤ n ≤ 7)
src2 register (Rn, 0 ≤ n ≤ 7)
src3 indirect (disp = 0, 1, IR0, IR1)
src4 indirect (disp = 0, 1, IR0, IR1)
P parallel addressing modes (0 ≤ P ≤ 3)
OPERATION
00 src3 x src4, src1 - src2
01 src3 x src1, src4 - src2
10 src1 x src2, src3 - src4
11 src3 x src1, src2 - src4

Encoding
31 24 23 16 15 8 7 0
Description A floating-point multiplication and a floating-point subtraction are
performed in parallel. All registers are read at the beginning and loaded at the
end of the execute cycle. This means that if one of the parallel operations
(MPYF3) reads from a register, and the operation being performed in parallel
(SUBF3) writes to the same register, then MPYF3 accepts as input the
contents of the register before it is modified by the SUBF3.

Any combination of addressing modes may be coded for the four possible
source operands as long as two are coded as indirect and two are coded as
register. The assignment of the source operands srcA-srcD to the src1-src4
fields varies depending on the combination of addressing modes used, and
the P field is encoded accordingly. When the order is not significant, the
assembler may change the order of operands in commutative operations
to simplify processing.

Cycles 1

Status Bits N 0
Z 0
V 1 if a floating-point overflow occurs, 0 otherwise.
C Unaffected.
UF 1 if a floating-point underflow occurs, 0 otherwise.
LV 1 if a floating-point overflow occurs, unchanged otherwise.
LUF 1 if a floating-point underflow occurs, unchanged otherwise.

Mode Bit OVM Operation not affected by OVM.

Example MPYF3 R5,*++AR7(IR1),R0
|| SUBF3 R7,*AR3--(1),R2
or

MPYF3 *++AR7(IR1),R5,R0
|| SUBF3 R7,*AR3--(1),R2


Before Instruction: 


R5 = 034C000000h = 1.2750e+01
AR7 = 809904h
IR1 = 8h
R0 = 0h
R7 = 0733C00000h = 1.79750e+02
AR3 = 8098B2h
R2 = 0h
Data at 80990Ch = 1110000h = 2.265625e+00
Data at 8098B2h = 70C8000h = 1.4050e+02
LUF LV UF NZVC=0000000

After Instruction:

R5 = 034C000000h = 1.2750e+01
AR7 = 80990Ch
IR1 = 8h
R0 = 0467180000h = 2.88867188e+01
R7 = 0733C00000h = 1.79750e+02
AR3 = 8098B1h
R2 = 05E3000000h = -3.9250e+01
Data at 80990Ch = 1110000h = 2.265625e+00
Data at 8098B2h = 70C8000h = 1.4050e+02
LUF LV UF NZVC=0000000 
MPYI Multiply Integer


Syntax MPYI <src>,<dst>

Operation dst x src → dst

Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 27)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

Description The product of the dst and src operands is loaded into the dst register. The
src and dst operands when read are assumed to be 24-bit signed integers.
The result is assumed to be a 48-bit signed integer. The output to the dst
register is the 32 least-significant bits of the result.


Integer overflow occurs when any of the most-significant 16 bits of the 
48-bit result differs from the most-significant bit of the 32-bit output value. 


Cycles 1

Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if an integer overflow occurs, 0 otherwise.
C Unaffected.
UF 0
LV 1 if an integer overflow occurs, unchanged otherwise.
LUF Unchanged.

Mode Bit OVM Operation affected by OVM.

Example MPYI R1,R5

Before Instruction:

R1 = 000033C251h = 3,392,081
R5 = 000078B600h = 7,910,912
LUF LV UF NZVC=0000000

After Instruction:

R1 = 000033C251h = 3,392,081
R5 = 00E21D9600h = -501,377,536
LUF LV UF NZVC=010101 0 
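
The overflow rule for MPYI can be checked against this example with a short sketch. It is a model of the arithmetic as described, under stated assumptions: operands are taken as the low 24 bits of each value and sign-extended, OVM is assumed to be 0 (no saturation is modeled), and the function name `mpyi` is hypothetical.

```python
def mpyi(src, dst):
    """Model MPYI with OVM = 0. Returns (32-bit result, V flag)."""

    def sext24(x):
        # Assumption: only the low 24 bits are read, as a signed integer.
        x &= 0xFFFFFF
        return x - 0x1000000 if x & 0x800000 else x

    product = sext24(src) * sext24(dst)   # always fits in 48 bits
    p48 = product & 0xFFFFFFFFFFFF        # 48-bit two's-complement result
    result = p48 & 0xFFFFFFFF             # 32 least-significant bits
    msb = result >> 31                    # MSB of the 32-bit output
    upper16 = p48 >> 32                   # most-significant 16 bits
    # Overflow if any upper bit differs from the MSB of the output.
    overflow = upper16 != (0xFFFF if msb else 0x0000)
    return result, int(overflow)


result, v = mpyi(0x33C251, 0x78B600)      # R1 and R5 from the example
print(hex(result), v)
```

For the example operands, the upper 16 bits of the 48-bit product are not a sign extension of bit 31 of the output, so V (and LV) are set and the 32-bit result reads as a negative number.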


Multiply Integer, 3-Operand MPYI3 


Syntax MPYI3 <src2>,<src1>,<dst>

Operation src1 x src2 → dst

Operands src1 three-operand addressing modes (T):
00 register (Rn1, 0 ≤ n1 ≤ 27)
01 indirect (disp = 0, 1, IR0, IR1)
10 register (Rn1, 0 ≤ n1 ≤ 27)
11 indirect (disp = 0, 1, IR0, IR1)

src2 three-operand addressing modes (T):
00 register (Rn2, 0 ≤ n2 ≤ 27)
01 register (Rn2, 0 ≤ n2 ≤ 27)
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

Description The product of the src1 and src2 operands is loaded into the dst register.
The src1 and src2 operands are assumed to be 24-bit signed integers. The
result is assumed to be a signed 48-bit integer. The output to the dst
register is the 32 least-significant bits of the result.

Integer overflow occurs when any of the most-significant 16 bits of the
48-bit result differs from the most-significant bit of the 32-bit output value.

Cycles 1

Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if an integer overflow occurs, 0 otherwise.
C Unaffected.
UF 0
LV 1 if an integer overflow occurs, unchanged otherwise.
LUF Unchanged.

Mode Bit OVM Operation affected by OVM.
Example MPYI3 *AR4,*-AR1(1),R2

Before Instruction:

AR4 = 809850h
AR1 = 8098F3h
R2 = 0h
Data at 809850h = 0ADh = 173
Data at 8098F2h = 0DCh = 220
LUF LV UF NZVC=0000000

After Instruction:

AR4 = 809850h
AR1 = 8098F3h
R2 = 094ACh = 38,060
Data at 809850h = 0ADh = 173
Data at 8098F2h = 0DCh = 220
LUF LV UF NZVC=0000000


Example MPYI3 *--AR4(IR0),R2,R7

Before Instruction:

AR4 = 8099F8h
IR0 = 8h
R2 = 0C8h = 200
R7 = 0h
Data at 8099F0h = 32h = 50
LUF LV UF NZVC=0000000

After Instruction:

AR4 = 8099F0h
IR0 = 8h
R2 = 0C8h = 200
R7 = 02710h = 10,000
Data at 8099F0h = 32h = 50
LUF LV UF NZVC=0000000
Parallel MPYI3 and ADDI3 MPYI3||ADDI3


Syntax MPYI3 <srcA>,<srcB>,<dst1>
|| ADDI3 <srcC>,<srcD>,<dst2>

Operation srcA x srcB → dst1
|| srcD + srcC → dst2

Operands srcA, srcB, srcC, srcD:
any two indirect (disp = 0, 1, IR0, IR1)
any two register (Rn, 0 ≤ n ≤ 7)
dst1 register (d1):
0 = R0
1 = R1
dst2 register (d2):
0 = R2
1 = R3
src1 register (Rn, 0 ≤ n ≤ 7)
src2 register (Rn, 0 ≤ n ≤ 7)
src3 indirect (disp = 0, 1, IR0, IR1)
src4 indirect (disp = 0, 1, IR0, IR1)
P parallel addressing modes (0 ≤ P ≤ 3)
OPERATION
00 src3 x src4, src1 + src2
01 src3 x src1, src4 + src2
10 src1 x src2, src3 + src4
11 src3 x src1, src2 + src4

Encoding
31 24 23 16 15 8 7 0
Description An integer multiplication and an integer addition are performed in parallel.
All registers are read at the beginning and loaded at the end of the execute
cycle. This means that if one of the parallel operations (MPYI3) reads from
a register and the operation being performed in parallel (ADDI3) writes to
the same register, then MPYI3 accepts as input the contents of the register
before it is modified by the ADDI3.

Any combination of addressing modes may be coded for the four possible
source operands as long as two are coded as indirect and two are coded as
register. The assignment of the source operands srcA-srcD to the src1-src4
fields varies depending on the combination of addressing modes used, and
the P field is encoded accordingly. When the order is not significant, the
assembler may change the order of operands in commutative operations
to simplify processing.

Cycles 1

Status Bits N 0
Z 0
V 1 if an integer overflow occurs, 0 otherwise.
C Unaffected.
UF 0
LV 1 if an integer overflow occurs, unchanged otherwise.
LUF Unchanged.

Mode Bit OVM Operation affected by OVM.

Example MPYI3 R7,R4,R0
|| ADDI3 *-AR3,*AR5--(1),R3

Before Instruction:

R7 = 14h = 20
R4 = 64h = 100
R0 = 0h
AR3 = 80981Fh
AR5 = 80996Eh
R3 = 0h
Data at 80981Eh = 0FFFFFFCBh = -53
Data at 80996Eh = 35h = 53
LUF LV UF NZVC=0000000

After Instruction:

R7 = 14h = 20
R4 = 64h = 100
R0 = 07D0h = 2000
AR3 = 80981Fh
AR5 = 80996Dh
R3 = 0h
Data at 80981Eh = 0FFFFFFCBh = -53
Data at 80996Eh = 35h = 53
LUF LV UF NZVC=0000000


Parallel MPYI3 and STI MPYI3||STI


Syntax MPYI3 <src2>,<src1>,<dst1>
|| STI <src3>,<dst2>

Operation src1 x src2 → dst1
|| src3 → dst2

Operands src1 register (Rn1, 0 ≤ n1 ≤ 7)
src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn3, 0 ≤ n3 ≤ 7)
src3 register (Rn4, 0 ≤ n4 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

Description An integer multiplication and an integer store are performed in parallel. All
registers are read at the beginning and loaded at the end of the execute
cycle. This means that if one of the parallel operations (STI) reads from a
register and the operation being performed in parallel (MPYI3) writes to the
same register, then STI accepts as input the contents of the register before
it is modified by the MPYI3.

If src2 and dst2 point to the same location, src2 is read before the write to
dst2.

Integer overflow occurs when any of the most-significant 16 bits of the
48-bit result differs from the most-significant bit of the 32-bit output value.

Cycles 1

Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if an integer overflow occurs, 0 otherwise.
C Unaffected.
UF 0
LV 1 if an integer overflow occurs, unchanged otherwise.
LUF Unaffected.

Mode Bit OVM Operation affected by OVM.
Example MPYI3 *++AR0(1),R5,R7
|| STI R2,*-AR3(1)

Before Instruction:

AR0 = 80995Ah
R5 = 32h = 50
R7 = 0h
R2 = 0DCh = 220
AR3 = 80982Fh
Data at 80995Bh = 0C8h = 200
Data at 80982Eh = 0h
LUF LV UF NZVC=0000000

After Instruction:

AR0 = 80995Bh
R5 = 32h = 50
R7 = 2710h = 10000
R2 = 0DCh = 220
AR3 = 80982Fh
Data at 80995Bh = 0C8h = 200
Data at 80982Eh = 0DCh = 220
LUF LV UF NZVC=0000000
Parallel MPYI3 and SUBI3 MPYI3||SUBI3


Syntax MPYI3 <srcA>,<srcB>,<dst1>
|| SUBI3 <srcC>,<srcD>,<dst2>

Operation srcA x srcB → dst1
|| srcD - srcC → dst2

Operands srcA, srcB, srcC, srcD:
any two indirect (disp = 0, 1, IR0, IR1)
any two register (Rn, 0 ≤ n ≤ 7)
dst1 register (d1):
0 = R0
1 = R1
dst2 register (d2):
0 = R2
1 = R3
src1 register (Rn, 0 ≤ n ≤ 7)
src2 register (Rn, 0 ≤ n ≤ 7)
src3 indirect (disp = 0, 1, IR0, IR1)
src4 indirect (disp = 0, 1, IR0, IR1)
P parallel addressing modes (0 ≤ P ≤ 3)
OPERATION
00 src3 x src4, src1 - src2
01 src3 x src1, src4 - src2
10 src1 x src2, src3 - src4
11 src3 x src1, src2 - src4

Encoding
31 24 23 16 15 8 7 0

Description An integer multiplication and an integer subtraction are performed in
parallel. All registers are read at the beginning and loaded at the end of the
execute cycle. This means that if one of the parallel operations (MPYI3)
reads from a register and the operation being performed in parallel (SUBI3)
writes to the same register, then MPYI3 accepts as input the contents of the
register before it is modified by the SUBI3.

Any combination of addressing modes may be coded for the four possible
source operands as long as two are coded as indirect and two are coded as
register. The assignment of the source operands srcA-srcD to the src1-src4
fields varies depending on the combination of addressing modes used, and
the P field is encoded accordingly. When the order is not significant, the
assembler may change the order of operands in commutative operations
to simplify processing.
Integer overflow occurs when any of the most-significant 16 bits of the
48-bit result differs from the most-significant bit of the 32-bit output value.

Cycles 1

Status Bits N 0
Z 0
V 1 if an integer overflow occurs, 0 otherwise.
C Unaffected.
UF 1 if an integer underflow occurs, 0 otherwise.
LV 1 if an integer overflow occurs, unchanged otherwise.
LUF Unchanged.

Mode Bit OVM Operation affected by OVM.

Example MPYI3 R2,*++AR0(1),R0
|| SUBI3 *AR5--(IR1),R4,R2
or

MPYI3 *++AR0(1),R2,R0
|| SUBI3 *AR5--(IR1),R4,R2

Before Instruction:

R2 = 32h = 50
AR0 = 8098E3h
R0 = 0h
AR5 = 8099FCh
IR1 = 0Ch
R4 = 07D0h = 2000
Data at 8098E4h = 62h = 98
Data at 8099FCh = 4B0h = 1200
LUF LV UF NZVC=0000000

After Instruction:

R2 = 320h = 800
AR0 = 8098E4h
R0 = 01324h = 4900
AR5 = 8099F0h
IR1 = 0Ch
R4 = 07D0h = 2000
Data at 8098E4h = 62h = 98
Data at 8099FCh = 4B0h = 1200
LUF LV UF NZVC=0000000 
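
This example is a useful illustration of the read-before-write rule, because R2 is both a source of MPYI3 and the destination of SUBI3. The Python sketch below models the instruction pair in four phases (address generation, operand read, compute, commit). It is illustrative only: plain Python integers and a dictionary stand in for registers and memory, overflow and status bits are not modeled, and the function name `mpyi3_subi3` is hypothetical.

```python
def mpyi3_subi3(regs, mem):
    """Model MPYI3 *++AR0(1),R2,R0 || SUBI3 *AR5--(IR1),R4,R2."""
    # 1. Address generation for the two indirect operands.
    new_ar0 = regs["AR0"] + 1             # *++AR0(1): pre-increment
    mul_addr = new_ar0
    sub_addr = regs["AR5"]                # *AR5--(IR1): use, then decrement
    new_ar5 = regs["AR5"] - regs["IR1"]

    # 2. Read every source operand before anything is written.
    r2_old = regs["R2"]
    r4 = regs["R4"]
    mul_mem = mem[mul_addr]
    sub_mem = mem[sub_addr]

    # 3. Compute both results from the values read in step 2.
    product = mul_mem * r2_old            # srcA x srcB
    difference = r4 - sub_mem             # srcD - srcC

    # 4. Commit all results at the end of the cycle.
    after = dict(regs)
    after.update(AR0=new_ar0, AR5=new_ar5, R0=product, R2=difference)
    return after


regs = {"R2": 50, "AR0": 0x8098E3, "R0": 0,
        "AR5": 0x8099FC, "IR1": 0x0C, "R4": 2000}
mem = {0x8098E4: 98, 0x8099FC: 1200}
after = mpyi3_subi3(regs, mem)
print(after["R0"], after["R2"])
```

The multiplier sees the old value of R2 (50), so R0 receives 50 x 98, while R2 itself is overwritten with the difference.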
Negate Integer with Borrow NEGB


Syntax NEGB <src>,<dst>

Operation 0 - src - C → dst

Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 27)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

Description The difference of the 0, src, and C operands is loaded into the dst register.
The dst and src are assumed to be signed integers.

Cycles 1

Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if an integer overflow occurs, 0 otherwise.
C 1 if a borrow occurs, 0 otherwise.
UF 0
LV 1 if an integer overflow occurs, unchanged otherwise.
LUF Unaffected.

Mode Bit OVM Operation affected by OVM.

Example NEGB R5,R7

Before Instruction:

R5 = 0FFFFFFCBh = -53
R7 = 0h
LUF LV UF NZVC=0000001

After Instruction:

R5 = 0FFFFFFCBh = -53
R7 = 34h = 52
LUF LV UF NZVC=000000 1 
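
The NEGB operation, 0 - src - C, can be sketched on 32-bit values as follows. This is a minimal model for checking the example. It assumes that "borrow" means the unsigned subtraction went below zero, it ignores OVM and the V flag, and the function name `negb` is hypothetical.

```python
def negb(src, carry):
    """Return (dst, borrow) for 0 - src - C on 32-bit values."""
    full = 0 - (src & 0xFFFFFFFF) - (carry & 1)   # exact, unbounded integer
    dst = full & 0xFFFFFFFF                       # wrap to 32 bits
    borrow = 1 if full < 0 else 0                 # assumed borrow rule
    return dst, borrow


dst, c = negb(0xFFFFFFCB, 1)    # R5 = -53 with C = 1, as in the example
print(hex(dst), c)
```

With src = -53 and C = 1 the result is 53 - 1 = 52 (34h), and C remains set, which agrees with the register and status values shown for the example.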




NEGF Negate Floating-Point 


Syntax NEGF <src>,<dst>

Operation 0 - src → dst

Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 7)

Encoding
31 24 23 16 15 8 7 0

Description The difference of the 0 and src operands is loaded into the dst register. The
dst and src operands are assumed to be floating-point numbers.

Cycles 1

Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if a floating-point overflow occurs, 0 otherwise.
C Unaffected.
UF 1 if a floating-point underflow occurs, 0 otherwise.
LV 1 if a floating-point overflow occurs, unchanged otherwise.
LUF 1 if a floating-point underflow occurs, unchanged otherwise.

Mode Bit OVM Operation not affected by OVM.

Example NEGF *++AR3(2),R1

Before Instruction:

AR3 = 809800h
R1 = 057B400025h = 6.28125006e+01
Data at 809802h = 70C8000h = 1.4050e+02
LUF LV UF NZVC=0000000

After Instruction:

AR3 = 809802h
R1 = 07F3800000h = -1.4050e+02
Data at 809802h = 70C8000h = 1.4050e+02
LUF LV UF NZVC=0001000


Parallel NEGF and STF NEGF||STF 


Syntax NEGF <src2>,<dst1>
|| STF <src3>,<dst2>

Operation 0 - src2 → dst1
|| src3 → dst2

Operands src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn1, 0 ≤ n1 ≤ 7)
src3 register (Rn2, 0 ≤ n2 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

Description A floating-point negation and a floating-point store are performed in
parallel. All registers are read at the beginning and loaded at the end of the
execute cycle. This means that if one of the parallel operations (STF) reads
from a register and the operation being performed in parallel (NEGF) writes
to the same register, then STF accepts as input the contents of the register
before it is modified by the NEGF.

If src2 and dst2 point to the same location, src2 is read before the write to
dst2.

Cycles 1

Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if a floating-point overflow occurs, 0 otherwise.
C Unaffected.
UF 1 if a floating-point underflow occurs, 0 otherwise.
LV 1 if a floating-point overflow occurs, unchanged otherwise.
LUF 1 if a floating-point underflow occurs, unchanged otherwise.

Mode Bit OVM Operation not affected by OVM.


Example NEGF *AR4--(1),R7
|| STF R2,*++AR5(1)

Before Instruction:

AR4 = 8098E1h
R7 = 0h
R2 = 0733C00000h = 1.79750e+02
AR5 = 809803h
Data at 8098E1h = 57B4000h = 6.281250e+01
Data at 809804h = 0h
LUF LV UF NZVC=0000000

After Instruction:

AR4 = 8098E0h
R7 = 0584C00000h = -6.281250e+01
R2 = 0733C00000h = 1.79750e+02
AR5 = 809804h
Data at 8098E1h = 57B4000h = 6.281250e+01
Data at 809804h = 733C000h = 1.79750e+02
LUF LV UF NZVC=0001000


Negate Integer NEGI


Syntax NEGI <src>,<dst>

Operation 0 - src → dst

Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 27)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

Description The difference of the 0 and src operands is loaded into the dst register. The
dst and src operands are assumed to be signed integers.

Cycles 1

Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if an integer overflow occurs, 0 otherwise.
C 1 if a borrow occurs, 0 otherwise.
UF 0
LV 1 if an integer overflow occurs, unchanged otherwise.
LUF Unaffected.

Mode Bit OVM Operation affected by OVM.

Example NEGI 174,R5 (174 = 0AEh)

Before Instruction:

R5 = 0DCh = 220
LUF LV UF NZVC=0000000

After Instruction:

R5 = 0FFFFFF52h = -174
LUF LV UF NZVC=0001001


NEGI||STI Parallel NEGI and STI 


Syntax NEGI <src2>,<dst1>
|| STI <src3>,<dst2>

Operation 0 - src2 → dst1
|| src3 → dst2

Operands src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn1, 0 ≤ n1 ≤ 7)
src3 register (Rn2, 0 ≤ n2 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

Description An integer negation and an integer store are performed in parallel. All
registers are read at the beginning and loaded at the end of the execute cycle.
This means that if one of the parallel operations (STI) reads from a register
and the operation being performed in parallel (NEGI) writes to the same
register, then STI accepts as input the contents of the register before it is
modified by the NEGI.

If src2 and dst2 point to the same location, src2 is read before the write to
dst2.

Cycles 1

Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if an integer overflow occurs, 0 otherwise.
C 1 if a borrow occurs, 0 otherwise.
UF 0
LV 1 if an integer overflow occurs, unchanged otherwise.
LUF Unaffected.

Mode Bit OVM Operation affected by OVM.


Example NEGI *-AR3,R2
|| STI R2,*AR1++

Before Instruction:

AR3 = 80982Fh
R2 = 19h = 25
AR1 = 8098A5h
Data at 80982Eh = 0DCh = 220
Data at 8098A5h = 0h
LUF LV UF NZVC=0000000

After Instruction:

AR3 = 80982Fh
R2 = 0FFFFFF24h = -220
AR1 = 8098A6h
Data at 80982Eh = 0DCh = 220
Data at 8098A5h = 19h = 25
LUF LV UF NZVC=0001001


NOP No Operation 


Syntax NOP <src>

Operation No ALU or multiplier operations.
ARn is modified if src is specified in indirect mode.

Operands src general addressing modes (G):
00 register (no operation)
10 indirect (modify ARn, 0 ≤ n ≤ 7)

Encoding
31 24 23 16 15 8 7 0

Description If the src operand is specified in the indirect mode, the specified addressing
operation is performed and a dummy memory read occurs. If the src operand
is omitted, no operation is performed.

Cycles 1

Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.

Example NOP

Before Instruction:
PC = 3Ah

After Instruction:
PC = 3Bh

Example NOP *AR3--(1)

Before Instruction:
PC = 5h
AR3 = 809900h

After Instruction:
PC = 6h
AR3 = 8098FFh 
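
The indirect forms used in the examples of this section differ only in when the auxiliary register is modified and whether the modification is kept. The Python sketch below summarizes that behavior as inferred from the before/after values in the examples. It is not a complete model: circular and bit-reversed addressing and address wraparound are not covered, and the function name `indirect` and its mode strings are hypothetical labels.

```python
def indirect(ar, disp, mode):
    """Return (effective_address, new_ar) for one indirect operand.

    mode is one of:
      "plus"     *+ARn(disp)    address = ARn + disp, ARn unchanged
      "minus"    *-ARn(disp)    address = ARn - disp, ARn unchanged
      "pre_inc"  *++ARn(disp)   ARn += disp, address = new ARn
      "pre_dec"  *--ARn(disp)   ARn -= disp, address = new ARn
      "post_inc" *ARn++(disp)   address = ARn, then ARn += disp
      "post_dec" *ARn--(disp)   address = ARn, then ARn -= disp
    """
    if mode == "plus":
        return ar + disp, ar
    if mode == "minus":
        return ar - disp, ar
    if mode == "pre_inc":
        return ar + disp, ar + disp
    if mode == "pre_dec":
        return ar - disp, ar - disp
    if mode == "post_inc":
        return ar, ar + disp
    if mode == "post_dec":
        return ar, ar - disp
    raise ValueError("unknown mode: " + mode)


# NOP *AR3--(1): a dummy read at the old address, then AR3 is decremented.
address, new_ar3 = indirect(0x809900, 1, "post_dec")
print(hex(address), hex(new_ar3))
```

The displacement argument is the literal displacement or the current value of IR0 or IR1, depending on how the operand is written.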




Normalize NORM


Syntax NORM <src>,<dst>

Operation norm (src) → dst

Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate

Encoding
31 24 23 16 15 8 7 0

Description The src operand is assumed to be an unnormalized floating-point number,
i.e., the implied bit is set equal to the sign bit. The dst is set equal to the
normalized src operand with the implied bit removed. The dst operand
exponent is set to the src operand exponent minus the size of the left-shift
necessary to normalize the src. The dst operand is assumed to be a
normalized floating-point number.

If src(exp) = -128 and src(man) = 0, then dst = 0, Z = 1, and UF = 0. If
src(exp) = -128 and src(man) ≠ 0, then dst = 0, Z = 0, and UF = 1. For
all other cases of the src, if a floating-point underflow occurs, then
dst(man) is forced to 0 and dst(exp) = -128. If src(man) = 0, then
dst(man) = 0 and dst(exp) = -128. Refer to Section 5.6.

Cycles 1

Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 0
C Unaffected.
UF 1 if a floating-point underflow occurs, 0 otherwise.
LV Unaffected.
LUF 1 if a floating-point underflow occurs, unchanged otherwise.

Mode Bit OVM Operation not affected by OVM.

Example NORM R1,R2

Before Instruction:

R1 = 0400003AF5h
R2 = 070C800000h
LUF LV UF NZVC=0000000

After Instruction:

R1 = 0400003AF5h
R2 = F26BD40000h = 1.12451613e-04
LUF LV UF NZVC=0000000 
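
The normalization described for NORM can be sketched on the exponent and 32-bit mantissa fields of a 40-bit register value. The Python below is a reading of that description, not a verified model of the hardware: status bits are not produced, the underflow threshold (a result exponent of -128 or less) is an assumption, and the function name `norm` is hypothetical.

```python
def norm(exp, man):
    """Normalize (exp, man); exp is signed, man is the 32-bit mantissa field.

    Returns (dst_exp, dst_man). A zero result is returned as (-128, 0).
    """
    man &= 0xFFFFFFFF
    if exp == -128 or man == 0:
        return -128, 0
    sign = man >> 31
    # 33-bit word: sign, implied bit (equal to the sign), 31 fraction bits.
    word = (sign << 32) | man
    shift = 0
    # Shift left until the implied bit (bit 31) differs from the sign (bit 32).
    while ((word >> 32) & 1) == ((word >> 31) & 1):
        word = (word << 1) & 0x1FFFFFFFF
        shift += 1
    dst_exp = exp - shift
    if dst_exp <= -128:                  # assumed underflow condition
        return -128, 0
    # Drop the implied bit: keep the sign and the 31 fraction bits.
    dst_man = (sign << 31) | (word & 0x7FFFFFFF)
    return dst_exp, dst_man


dst_exp, dst_man = norm(0x04, 0x00003AF5)        # R1 from the example
packed = ((dst_exp & 0xFF) << 32) | dst_man      # 40-bit register image
print(dst_exp, hex(dst_man), hex(packed))
```

For the example, the mantissa needs a left shift of 18, so the exponent becomes 4 - 18 = -14 (F2h), and the packed value matches the R2 contents shown after the instruction.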




NOT Bitwise Logical-Complement 


Syntax NOT <src>,<dst>

Operation ~src → dst

Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 27)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

Description The bitwise logical-complement of the src operand is loaded into the dst
register. The complement is formed by a logical-NOT of each bit of the src
operand. The dst and src operands are assumed to be unsigned integers.

Cycles 1

Status Bits N MSB of the output.
Z 1 if a zero output is generated, 0 otherwise.
V 0
C Unaffected.
UF 0
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.

Example NOT @982Ch,R4

Before Instruction:

DP = 80h
R4 = 0h
Data at 80982Ch = 5E2Fh
LUF LV UF NZVC=0000000

After Instruction:

DP = 80h
R4 = 0FFFFA1D0h
Data at 80982Ch = 5E2Fh
LUF LV UF NZVC=000100 0 
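
The NOT result and the two status bits it computes (N and Z) can be reproduced with a few lines of Python. This is a minimal sketch on 32-bit values, and the function name `not32` is hypothetical.

```python
def not32(src):
    """Return (output, N, Z) for a 32-bit bitwise complement."""
    out = ~src & 0xFFFFFFFF      # complement each bit, keep 32 bits
    n = out >> 31                # N is the MSB of the output
    z = 1 if out == 0 else 0     # Z is set only for an all-zero output
    return out, n, z


out, n, z = not32(0x5E2F)        # data at 80982Ch in the example
print(hex(out), n, z)
```

Because the source word has its upper bits clear, the complement has bit 31 set, which is why N is 1 after the instruction.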




Parallel NOT and STI NOT||STI


Syntax NOT <src2>,<dst1>
|| STI <src3>,<dst2>

Operation ~src2 → dst1
|| src3 → dst2

Operands src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn1, 0 ≤ n1 ≤ 7)
src3 register (Rn2, 0 ≤ n2 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

Description A bitwise logical-NOT and an integer store are performed in parallel. All
registers are read at the beginning and loaded at the end of the execute
cycle. This means that if one of the parallel operations (STI) reads from a
register and the operation being performed in parallel (NOT) writes to the
same register, then STI accepts as input the contents of the register before
it is modified by the NOT.

If src2 and dst2 point to the same location, src2 is read before the write to
dst2.

Cycles 1

Status Bits N MSB of the output.
Z 1 if a zero output is generated, 0 otherwise.
V 0
C Unaffected.
UF 0
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.


Example NOT *+AR2,R3
|| STI R7,*--AR4(IR1)

Before Instruction:

AR2 = 8099CBh
R3 = 0h
R7 = 0DCh = 220
AR4 = 809850h
IR1 = 10h
Data at 8099CCh = 0C2Fh
Data at 809840h = 0h
LUF LV UF NZVC=0000000

After Instruction:

AR2 = 8099CBh
R3 = 0FFFFF3D0h
R7 = 0DCh = 220
AR4 = 809840h
IR1 = 10h
Data at 8099CCh = 0C2Fh
Data at 809840h = 0DCh = 220
LUF LV UF NZVC=0001000 




Bitwise Logical-OR OR


Syntax OR <src>,<dst>

Operation dst OR src → dst

Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 27)
01 direct
10 indirect
11 immediate (not sign-extended)

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

Description The bitwise logical-OR between the src and dst operands is loaded into the
dst register. The dst and src operands are assumed to be unsigned integers.

Cycles 1

Status Bits N MSB of the output.
Z 1 if a zero output is generated, 0 otherwise.
V 0
C Unaffected.
UF 0
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.


Example OR *++AR1(IR1),R2

Before Instruction:

AR1 = 809800h
IR1 = 4h
R2 = 012560000h
Data at 809804h = 2BCDh
LUF LV UF NZVC=0000000

After Instruction:

AR1 = 809804h
IR1 = 4h
R2 = 012562BCDh
Data at 809804h = 2BCDh
LUF LV UF NZVC=0000000


Bitwise Logical-OR, 3-Operand OR3


Syntax OR3 <src2>,<src1>,<dst>

Operation src1 OR src2 → dst

Operands src1 three-operand addressing modes (T):
00 register (Rn1, 0 ≤ n1 ≤ 27)
01 indirect (disp = 0, 1, IR0, IR1)
10 register (Rn1, 0 ≤ n1 ≤ 27)
11 indirect (disp = 0, 1, IR0, IR1)

src2 three-operand addressing modes (T):
00 register (Rn2, 0 ≤ n2 ≤ 27)
01 register (Rn2, 0 ≤ n2 ≤ 27)
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

Description The bitwise logical-OR between the src1 and src2 operands is loaded into
the dst register. The src1, src2, and dst operands are assumed to be
unsigned integers.

Cycles 1

Status Bits N MSB of the output.
Z 1 if a zero output is generated, 0 otherwise.
V 0
C Unaffected.
UF 0
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.


Example OR3 *++AR1(IR1),R2,R7

Before Instruction:

AR1 = 809800h
IR1 = 4h
R2 = 012560000h
R7 = 0h
Data at 809804h = 2BCDh
LUF LV UF NZVC=0000000

After Instruction:

AR1 = 809804h
IR1 = 4h
R2 = 012560000h
R7 = 012562BCDh
Data at 809804h = 2BCDh
LUF LV UF NZVC=0000000


Parallel OR3 and STI OR3||STI 


Syntax OR3 <src2>,<src1>,<dst1>
|| STI <src3>,<dst2>

Operation src1 OR src2 → dst1
|| src3 → dst2

Operands src1 register (Rn1, 0 ≤ n1 ≤ 7)
src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn2, 0 ≤ n2 ≤ 7)
src3 register (Rn3, 0 ≤ n3 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

Description A bitwise logical-OR and an integer store are performed in parallel. All
registers are read at the beginning and loaded at the end of the execute cycle.
This means that if one of the parallel operations (STI) reads from a register
and the operation being performed in parallel (OR3) writes to the same
register, then STI accepts as input the contents of the register before it is
modified by the OR3.

If src2 and dst2 point to the same location, src2 is read before the write to
dst2.

Cycles 1

Status Bits N MSB of the output.
Z 1 if a zero output is generated, 0 otherwise.
V 0
C Unaffected.
UF 0
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.


Example OR3 *++AR2,R5,R2
|| STI R6,*AR1--

Before Instruction:

AR2 = 809830h
R5 = 800000h
R2 = 0h
R6 = 0DCh = 220
AR1 = 809883h
Data at 809831h = 9800h
Data at 809883h = 0h
LUF LV UF NZVC=0000000

After Instruction:

AR2 = 809831h
R5 = 800000h
R2 = 809800h
R6 = 0DCh = 220
AR1 = 809882h
Data at 809831h = 9800h
Data at 809883h = 0DCh = 220
LUF LV UF NZVC=0000000


POP Integer 


Syntax 

Operation 
Operands 
oe 


POP <dst> 
*SP-- > dst 
dst register (Rn,O <n < 27) 


24 23 1615 87 


POP 


EY KEERO DS SE COONEY 


Description 


Cycles 
Status Bits 


Mode Bit 


Example 


The top of the current system stack is popped and loaded into the dst reg- 
ister. The top of the stack is assumed to be a signed integer. The POP is 


performed with a post decrement of the stack pointer. 


1 


N 1 if a negative result is generated, O otherwise. 
Zz 1 if a zero result is generated, 0 otherwise. 

V 0 

C Unaffected. 

UF 0 


LV Unaffected. 
LUF Unaffected. 


OVM Operation not affected by OVM. 
POP R3 


Before Instruction: 


SP = 809856h 

R3 = 012DAh = 4,826 

Data at 809856h = 0FFFF0DA4h = -62,044
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:


SP = 809855h

R3 = 0FFFF0DA4h = -62,044

Data at 809856h = 0FFFF0DA4h = -62,044
LUF LV UF N Z V C = 0 0 0 1 0 0 0


POPF POP Floating-Point

Syntax POPF <dst>
Operation *SP-- → dst
Operands dst register (Rn, 0 ≤ n ≤ 7)
Encoding
31 24 23 16 15 8 7 0

Description The top of the current system stack is popped and loaded into the dst
register. The top of the stack is assumed to be a floating-point number. The
POP is performed with a postdecrement of the stack pointer.

Cycles 1
Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 0
C Unaffected.
UF 0
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.


Example POPF R4 


Before Instruction: 


SP = 80984Ah 

R4 = 025D2E0123h = 6.91186578e+00 

Data at 80984Ah = 5F2C1302h = 5.32544007e+28 
LUF LV UF NZVC=0000000 


After Instruction: 


SP = 809849h 

R4 = 5F2C130200h = 5.32544007e+28 

Data at 80984Ah = 5F2C1302h = 5.32544007e+28 
LUF LV UF NZVC=000000 0 


PUSH Integer PUSH

Syntax PUSH <src>
Operation src → *++SP
Operands src register (Rn, 0 ≤ n ≤ 27)
Encoding
31 24 23 16 15 8 7 0

Description The contents of the src register are pushed on the current system stack. The
src is assumed to be a signed integer. The PUSH is performed with a
preincrement of the stack pointer.

Cycles 1
Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.
Mode Bit OVM Operation not affected by OVM.

Example PUSH R6


Before Instruction: 


SP = 8098AEh 

R6 = 815Bh = 33,115 

Data at 8098AFh = 0h

LUF LV UF NZVC=0000000 


After Instruction: 


SP = 8098AFh 

R6 = 815Bh = 33,115 

Data at 8098AFh = 815Bh = 33,115 

LUF LV UF NZ V C=0000000 
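The stack discipline shared by PUSH and POP (preincrement on push, postdecrement on pop) can be sketched as follows. The `Stack` class and its dictionary-backed memory are hypothetical names used only for illustration; the values come from the PUSH example above.

```python
class Stack:
    """Toy model of the system stack: SP always points at the current top word."""

    def __init__(self, sp):
        self.sp = sp
        self.mem = {}

    def push(self, value):
        # PUSH: preincrement SP, then store (src -> *++SP).
        self.sp += 1
        self.mem[self.sp] = value & 0xFFFFFFFF

    def pop(self):
        # POP: read the top word, then postdecrement SP (*SP-- -> dst).
        value = self.mem[self.sp]
        self.sp -= 1
        return value


stack = Stack(sp=0x8098AE)
stack.push(0x815B)          # SP becomes 8098AFh and the word is stored there
sp_after_push = stack.sp
popped = stack.pop()        # returns 815Bh and restores SP to 8098AEh
```

As on the device, the popped word remains in memory; only SP moves.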


PUSHF PUSH Floating-Point

Syntax PUSHF <src>
Operation src → *++SP
Operands src register (Rn, 0 ≤ n ≤ 7)
Encoding
31 24 23 16 15 8 7 0

Description The contents of the src register are pushed on the current system stack. The
src is assumed to be a floating-point number. The PUSH is performed with
a preincrement of the stack pointer.


Cycles 1 


Status Bits N Unaffected. 
Z Unaffected. 
V Unaffected. 
C Unaffected. 
UF Unaffected. 
LV Unaffected. 
LUF Unaffected. 


Mode Bit OVM Operation not affected by OVM. 


Example PUSHF R2 


Before Instruction: 


SP = 809801h 

R2 = 025C128081h = 6.87725854e+00 
Data at 809802h = 0h

LUF LV UF NZV C=0 000000 


After Instruction: 


SP = 809802h 

R2 = 025C128081h = 6.87725854e+00 

Data at 809802h = 025C1280h = 6.87725830e+00 
LUF LV UF NZVC=0000000 


Return From Interrupt Conditionally RETIcond

Syntax RETIcond
Operation If cond is true:
*SP-- → PC
1 → ST(GIE).
Else, continue.
Operands None
Encoding
31 24 23 16 15 8 7 0

Description A conditional return is performed. If the condition is true, the top of the
stack is popped to the PC, and 1 is written to the global interrupt enable
(GIE) bit of the status register. This has the effect of enabling all interrupts
for which the corresponding interrupt enable bit is a 1.

The TMS320C30 provides 20 condition codes that can be used with this
instruction (see Section 9.1 for a list of condition mnemonics, encoding,
and flags).

Cycles 4
Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.

Example RETINZ


Before Instruction: 


PC = 456h 

SP = 809830h 

ST = 0h

Data at 809830h = 123h 

LUF LV UF NZVC=0000000 


After Instruction: 


PC = 123h 
SP = 80982Fh 
ST = 2000h 


Data at 809830h = 123h 
LUF LV UF NZV C=0 000000 


RETScond Return From Subroutine Conditionally

Syntax RETScond
Operation If cond is true:
*SP-- → PC.
Else, continue.
Operands None
Encoding
31 24 23 16 15 8 7 0

Description A conditional return is performed. If the condition is true, the top of the
stack is popped to the PC.

The TMS320C30 provides 20 condition codes that can be used with this
instruction (see Section 9.1 for a list of condition mnemonics, encoding,
and flags).


Cycles 4 


Status Bits N Unaffected. 
Z Unaffected. 
V Unaffected. 
C Unaffected. 
UF Unaffected. 
LV Unaffected. 
LUF Unaffected. 


Mode Bit OVM Operation not affected by OVM. 


Example RETSGE 


Before Instruction: 


PC = 123h 

SP = 80983Ch 

Data at 80983Ch = 456h 

LUF LV UF NZVC=0 000000 


After Instruction: 


PC = 456h 

SP = 80983Bh 

Data at 80983Ch = 456h 

LUF LV UF NZVC=0000000 


Round Floating-Point RND

Syntax RND <src>,<dst>

Operation rnd(src) → dst

Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate
dst register (Rn, 0 ≤ n ≤ 7)
Encoding
31 24 23 16 15 8 7 0

Description The result of rounding the src operand is loaded into the dst register. The
src operand is rounded to the nearest single-precision floating-point value.
If the src operand is exactly half-way between two single-precision values,
it is rounded to the most positive of those values.

Cycles 1

Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if a floating-point overflow occurs, 0 otherwise.
C Unaffected.
UF 1 if a floating-point underflow occurs, 0 otherwise.
LV 1 if a floating-point overflow occurs, unchanged otherwise.
LUF 1 if a floating-point underflow occurs, unchanged otherwise.

Mode Bit OVM Operation not affected by OVM.


Example RND R5,R2 


Before Instruction: 


R5 = 0733C16EEFh = 1.79755599e+02 
R2 = 0h
LUF LV UF NZVC=0000000 


After Instruction: 


R5 = 0733C16EEFh = 1.79755599e+02 
R2 = 0733C16F00h = 1.79755600e+02 
LUF LV UF NZVC=0000000 


ROL Rotate Left

Syntax ROL <dst>

Operation dst left-rotated 1 bit → dst
Operands dst register (Rn, 0 ≤ n ≤ 27)
Encoding
31 24 23 16 15 8 7 0

Description The contents of the dst operand are left-rotated one bit and loaded into the
dst register. This rotate is a circular rotate with the MSB transferred into the
LSB.

Rotate left:
[diagram: the MSB of dst moves into C and into the LSB of dst]

Cycles 1
Status Bits N MSB of the output.
Z 1 if a zero output is generated, 0 otherwise.
V 0
C Set to the value of the bit rotated out of the high-order bit. Unaffected
if dst is not R0-R7.
UF 0
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.


Example ROL R3 


Before Instruction: 


R3 = 80025CD4h 
LUF LV UF NZV C=0000000 


After Instruction: 


R3 = 0004B9A9h 
LUF LV UF NZV C=00000 01 
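A minimal Python sketch of the circular rotate, assuming a 32-bit register value held in an ordinary integer. The helper name `rol` is hypothetical, and the rule that C is updated only for R0-R7 is not modeled.

```python
MASK32 = 0xFFFFFFFF


def rol(value):
    """Rotate a 32-bit word left by one bit; return (result, carry)."""
    value &= MASK32
    msb = (value >> 31) & 1
    # The bit leaving position 31 re-enters at position 0 and is copied to C.
    result = ((value << 1) & MASK32) | msb
    return result, msb


# Value from the ROL example above.
result, carry = rol(0x80025CD4)
```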
Rotate Left Through Carry ROLC

Syntax ROLC <dst>

Operation dst left-rotated 1 bit through carry bit → dst

Operands dst register (Rn, 0 ≤ n ≤ 27)

Encoding

31 24 23 16 15 8 7 0


100100 1a] dst 0000000000000 0 0 1 


Description The contents of the dst operand are left-rotated one bit through the carry
bit and loaded into the dst register. The MSB is rotated to the carry bit; at
the same time, the carry bit is transferred to the LSB.

Rotate left through carry bit:
[diagram: the MSB of dst moves into C; the previous C moves into the LSB of dst]

Cycles 1

Status Bits N MSB of the output.
Z 1 if a zero output is generated, 0 otherwise.
V 0
C Set to the value of the bit rotated out of the high-order bit. If dst is
not R0-R7, then C is shifted into the dst but not changed.
UF 0
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.


Example ROLC R3 


Before Instruction: 


R3 = 00000420h 
LUF LV UF NZV C=00000 0 1 


After Instruction: 
R3 = 00000841h
LUF LV UF NZV C=0 000000 


Example ROLC R3 


Before Instruction: 


R3 = 80004281h 
LUF LV UF NZVC=0000000 


After Instruction: 


R3 = 00008502h 
LUF LV UF NZVC=000000 1 
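Rotating through carry is what lets a shift span more than one word. The sketch below models ROLC and then chains two of them into a 64-bit left shift; `rolc` and `shift_left_64` are hypothetical helper names, and the multiword use is an illustration rather than a routine from this guide.

```python
MASK32 = 0xFFFFFFFF


def rolc(value, carry):
    """Rotate a 32-bit word left one bit through carry; return (result, carry)."""
    value &= MASK32
    carry_out = (value >> 31) & 1
    result = ((value << 1) & MASK32) | (carry & 1)
    return result, carry_out


def shift_left_64(high, low):
    """Shift a 64-bit value held in two 32-bit words left by one bit."""
    low, carry = rolc(low, 0)         # bit 31 of the low word leaves through C
    high, carry = rolc(high, carry)   # and enters bit 0 of the high word
    return high, low, carry
```

With the values from the two examples above, `rolc(0x00000420, 1)` gives `(0x00000841, 0)` and `rolc(0x80004281, 0)` gives `(0x00008502, 1)`.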
ROR Rotate Right

Syntax ROR <dst>

Operation dst right-rotated 1 bit → dst

Operands dst register (Rn, 0 ≤ n ≤ 27)

Encoding 

31 24 23 1615 87 0 


Description The contents of the dst operand are right-rotated one bit and loaded into
the dst register. The LSB is rotated into the carry bit and also transferred
into the MSB.

Rotate right:
[diagram: the LSB of dst moves into C and into the MSB of dst]

Cycles 1

Status Bits N MSB of the output.
Z 1 if a zero output is generated, 0 otherwise.
V 0
C Set to the value of the bit rotated out of the low-order bit. Unaffected
if dst is not R0-R7.
UF 0
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.

Example ROR R7


Before Instruction: 

R7 = 00000421h 

LUF LV UF NZ V C=0 000000 
After Instruction: 


R7 = 80000210h 
LUF LV UF N Z V C = 0 0 0 1 0 0 1

Rotate Right Through Carry RORC

Syntax RORC <dst>

Operation dst right-rotated 1 bit through carry bit → dst

Operands dst register (Rn, 0 ≤ n ≤ 27)

Encoding 

31 24 23 1615 87 0 


Description The contents of the dst operand are right-rotated one bit through the carry
bit and loaded into the dst register. The LSB is rotated into the carry bit.
At the same time, the carry bit is transferred into the MSB.

Rotate right through carry bit:
[diagram: the LSB of dst moves into C; the previous C moves into the MSB of dst]

Cycles 1

Status Bits N MSB of the output.
Z 1 if a zero output is generated, 0 otherwise.
V 0
C Set to the value of the bit rotated out of the low-order bit. If dst is
not R0-R7, then C is shifted in but not changed.
UF 0
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.


Example RORC R4 


Before Instruction: 


R4 = 00000081h 
LUF LV UF NZV C=000000 1 


After Instruction: 


R4 = 80000040h 
LUF LV UF NZVC=0001 00 1 
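The right rotate through carry can be sketched the same way. The function name `rorc` is hypothetical, and the special handling of C for destinations other than R0-R7 is left out.

```python
MASK32 = 0xFFFFFFFF


def rorc(value, carry):
    """Rotate a 32-bit word right one bit through carry; return (result, carry)."""
    value &= MASK32
    carry_out = value & 1                          # LSB leaves through C
    result = (value >> 1) | ((carry & 1) << 31)    # previous C enters at the MSB
    return result, carry_out


# Values from the RORC example above: R4 = 00000081h with C = 1.
result, carry = rorc(0x00000081, 1)
```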
RPTB Repeat Block

Syntax RPTB <src>
Operation src → RE
1 → ST(RM)
Next PC → RS
Operands src long-immediate addressing mode
Encoding
31 24 23 16 15 8 7 0

Description RPTB allows a block of instructions to be repeated a number of times
without any penalty for looping. This instruction activates the block repeat
mode of updating the PC. The src operand is a 24-bit unsigned immediate
value that is loaded into the repeat end address (RE) register. A 1 is written
into the repeat mode bit of the status register, ST(RM), to indicate that the PC
is being updated in the repeat mode. The address of the next instruction is
loaded into the repeat start address (RS) register.

Cycles 4
Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.

Example RPTB 127h

Before Instruction:

PC = 123h

ST = 0h

RE = 0h

RS = 0h


LUF LV UF NZVC=0000000 


After Instruction: 


PC = 124h 
ST = 100h 
RE = 127h 
RS = 124h 


LUF LV UF NZVC=0000000 


Repeat Single RPTS

Syntax RPTS <src>
Operation src → RC
1 → ST(RM)
1 → S
Next PC → RS
Next PC → RE
Operands src general addressing modes (G):
00 register
01 direct
10 indirect
11 immediate
Encoding
31 24 23 16 15 8 7 0

Description The RPTS instruction allows a single instruction to be repeated a number
of times without any penalty for looping. Fetches can also be made from the
instruction register (IR), thus avoiding repeated memory access.

The src operand is loaded into the repeat counter (RC). A 1 is written into
the repeat mode bit of the status register ST(RM). A 1 is also written into
the repeat single bit (S). This indicates that the program fetches are to be
performed only from the instruction register (the repeat single mode). The
next PC is loaded into the repeat end address (RE) register and the repeat
start address (RS) register.

The src operand is assumed to be an unsigned integer and is not
sign-extended for immediate mode.

Cycles 4
Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.

Example RPTS AR5


Before Instruction: 


PC = 123h 

ST = 0h

RS = 0h

RE = 0h

RC = 0h

AR5 = 0FFh

LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction: 


PC = 124h 
ST = 100h 
RS = 124h 
RE = 124h 
RC = 0FFh
AR5 = 0FFh


LUF LV UF NZV C=0000000 
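The register updates listed under Operation can be collected into one small function. This is a sketch only: `rpts` and `RM_BIT` are hypothetical names, the RM bit position is inferred from the example (ST changes from 0h to 100h), and S is modeled as a separate flag because the example does not show it in ST.

```python
MASK32 = 0xFFFFFFFF
RM_BIT = 0x100  # assumed position of ST(RM), inferred from the example


def rpts(pc, st, count):
    """Return the register state after RPTS as a dictionary."""
    next_pc = pc + 1
    return {
        "PC": next_pc,
        "ST": st | RM_BIT,       # 1 -> ST(RM)
        "S": 1,                  # 1 -> S: fetch only from the instruction register
        "RS": next_pc,           # Next PC -> RS
        "RE": next_pc,           # Next PC -> RE
        "RC": count & MASK32,    # src -> RC, treated as unsigned
    }


# Values from the RPTS AR5 example: PC = 123h, ST = 0h, AR5 = 0FFh.
state = rpts(0x123, 0x0, 0xFF)
```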
Signal, Interlocked SIGI

Syntax SIGI

Operation Signal interlocked operation.
Wait for interlock acknowledge.
Clear interlock.

Operands None
Encoding
31 24 23 16 15 8 7 0

Description An interlocked operation is signaled over XF0 and XF1. After the interlocked
operation is acknowledged, the interlocked operation ends. SIGI ignores
the external ready signals. Refer to Section 7.3 for detailed information.


Cycles 1 


Status Bits N Unaffected. 
Z Unaffected. 
V Unaffected. 
C Unaffected. 
UF Unaffected. 
LV Unaffected. 
LUF Unaffected. 


Mode Bit OVM Operation not affected by OVM.

STF Store Floating-Point

Syntax STF <src>,<dst>
Operation src → dst
Operands src register (Rn, 0 ≤ n ≤ 7)
dst general addressing modes (G):
01 direct
10 indirect
Encoding
31 24 23 16 15 8 7 0

Description The src register is loaded into the dst memory location. The src and dst
operands are assumed to be floating-point numbers.

Cycles 1

Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.

Example STF R2,@98A1h


Before Instruction: 


DP = 80h 

R2 = 052C501900h = 4.30782204e+01 
Data at 8098A1h = 0h

LUF LV UF NZV C=0 000000 


After Instruction: 


DP = 80h 

R2 = 052C501900h = 4.30782204e+01 

Data at 8098A1h = 052C5019h = 4.30782204e+01
LUF LV UF N Z V C = 0 0 0 0 0 0 0
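The examples show registers as 40-bit values and memory as 32-bit words. The STF and PUSHF examples are consistent with the store keeping the upper 32 bits of the register and dropping the low 8 mantissa bits without rounding. The sketch below assumes exactly that; `stf_word` is a hypothetical helper name.

```python
def stf_word(reg40):
    """Return the 32-bit word written to memory for a 40-bit register value."""
    # Drop the 8 least significant mantissa bits (truncation, no rounding).
    return (reg40 >> 8) & 0xFFFFFFFF


# Register values taken from the STF and PUSHF examples.
stf_example = stf_word(0x052C501900)
pushf_example = stf_word(0x025C128081)
```

The PUSHF value is the more telling case: its low byte is 81h, yet the stored word ends in 80h, which matches truncation rather than rounding.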
Store Floating-Point, Interlocked STFI

Syntax STFI <src>,<dst>
Operation src → dst
Signal end of interlocked operation.
Operands src register (Rn, 0 ≤ n ≤ 7)
dst general addressing modes (G):
01 direct
10 indirect
Encoding
31 24 23 16 15 8 7 0

Description The src register is loaded into the dst memory location. An interlocked
operation is signaled over pins XF0 and XF1. The src and dst operands are
assumed to be floating-point numbers. Refer to Section 7.3 for detailed
information.

Cycles 1
Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.

Example STFI R3,*-AR4


Before Instruction: 


R3 = 0733C00000h = 1.79750e+02 

AR4 = 80993Ch 

Data at 80993Bh = 0h

LUF LV UF NZV C=0 000000 


After Instruction: 


R3 = 0733C00000h = 1.79750e+02 

AR4 = 80993Ch 

Data at 80993Bh = 0733C000h = 1.79750e+02
LUF LV UF N Z V C = 0 0 0 0 0 0 0

STF||STF Parallel STF and STF

Syntax STF <src2>,<dst2>
|| STF <src1>,<dst1>
Operation src2 → dst2
|| src1 → dst1
Operands src1 register (Rn1, 0 ≤ n1 ≤ 7)
dst1 indirect (disp = 0, 1, IR0, IR1)
src2 register (Rn2, 0 ≤ n2 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)
Encoding
31 24 23 16 15 8 7 0

Description Two floating-point stores are performed in parallel. If both stores are
executed to the same address, the value written is that of STF
<src2>,<dst2>.

Cycles 1
Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.

Example STF R4,*AR3--
|| STF R3,*++AR5


Before Instruction: 


R4 = 070C800000h = 1.4050e+02 

AR3 = 809835h 

R3 = 0733C00000h = 1.79750e+02 

AR5 = 8099D2h 

Data at 809835h = 0h

Data at 8099D3h = 0h

LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction: 


R4 = 070C800000h = 1.4050e+02 
AR3 = 809834h

R3 = 0733C00000h = 1.79750e+02 

AR5 = 8099D3h 

Data at 809835h = 070C8000h = 1.4050e+02 
Data at 8099D3h = 0733C000h = 1.79750e+02 
LUF LV UF N Z V C = 0 0 0 0 0 0 0
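The same-address rule can be illustrated with a short sketch. The function `stf_stf` and the dictionary memory are hypothetical; the words passed in are the 32-bit values already shown in the example, and address-register updates are not modeled.

```python
def stf_stf(mem, src2_word, dst2_addr, src1_word, dst1_addr):
    """Model STF src2,dst2 || STF src1,dst1 on a dictionary-based memory."""
    mem[dst1_addr] = src1_word
    mem[dst2_addr] = src2_word  # applied last, so it wins if the addresses match
    return mem


# Distinct addresses, as in the example above.
separate = stf_stf({}, 0x070C8000, 0x809835, 0x0733C000, 0x8099D3)

# Both stores aimed at one address: the src2 word is the one that remains.
shared = stf_stf({}, 0x070C8000, 0x809835, 0x0733C000, 0x809835)
```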
STI Store Integer

Syntax STI <src>,<dst>

Operation src → dst
Operands src register (Rn, 0 ≤ n ≤ 27)
dst general addressing modes (G):
01 direct
10 indirect
Encoding
31 24 23 16 15 8 7 0

Description The src register is loaded into the dst memory location. The src and dst
operands are assumed to be signed integers.

Cycles 1

Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.

Example STI R4,@982Bh


Before Instruction: 


DP = 80h 

R4 = 42BD7h = 273,367 

Data at 80982Bh = 0E5FCh = 58,876

LUF LV UF NZVC=0000000 


After Instruction: 


DP = 80h 

R4 = 42BD7h = 273,367 

Data at 80982Bh = 42BD7h = 273,367 
LUF LV UF NZVC=0000000 


Store Integer, Interlocked STII

Syntax STII <src>,<dst>
Operation src → dst
Signal end of interlocked operation.
Operands src register (Rn, 0 ≤ n ≤ 27)
dst general addressing modes (G):
01 direct
10 indirect
Encoding
31 24 23 16 15 8 7 0

Description The src register is loaded into the dst memory location. An interlocked
operation is signaled over pins XF0 and XF1. The src and dst operands are
assumed to be signed integers. Refer to Section 7.3 for detailed information.

Cycles 1

Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.

Example STII R1,@98AEh


Before Instruction: 
DP = 80h 


R1 = 78Dh 
Data at 8098AEh = 25Ch 


After Instruction: 


DP = 80h 
R1 = 78Dh 
Data at 8098AEh = 78Dh
STI||STI Parallel STI and STI

Syntax STI <src2>,<dst2>
|| STI <src1>,<dst1>
Operation src2 → dst2
|| src1 → dst1
Operands src1 register (Rn1, 0 ≤ n1 ≤ 7)
dst1 indirect (disp = 0, 1, IR0, IR1)
src2 register (Rn2, 0 ≤ n2 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)


Encoding 
31 24 23 1615 87 0 


Description Two integer stores are performed in parallel. If both stores are executed to 
the same address, the value written is that of STI <src2>,<dst2>. 


Cycles 1 


Status Bits N Unaffected. 
Z Unaffected. 
V Unaffected. 
C Unaffected. 
UF Unaffected. 
LV Unaffected. 
LUF Unaffected. 


Mode Bit OVM Operation not affected by OVM.

Example STI R0,*++AR2(IR0)
|| STI R5,*AR0


Before Instruction: 


R0 = 0DCh = 220

AR2 = 809830h

IR0 = 8h

R5 = 35h = 53

AR0 = 8098D3h

Data at 809838h = 0h

Data at 8098D3h = 0h

LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction: 


R0 = 0DCh = 220

AR2 = 809838h

IR0 = 8h

R5 = 35h = 53

AR0 = 8098D3h

Data at 809838h = 0DCh = 220

Data at 8098D3h = 35h = 53

LUF LV UF N Z V C = 0 0 0 0 0 0 0


SUBB Subtract Integer with Borrow

Syntax SUBB <src>,<dst>
Operation dst - src - C → dst
Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 27)
01 direct
10 indirect
11 immediate
dst register (Rn, 0 ≤ n ≤ 27)
Encoding
31 24 23 16 15 8 7 0

Description The difference of the dst, src, and C operands is loaded into the dst register.
The dst and src operands are assumed to be signed integers.

Cycles 1
Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if an integer overflow occurs, 0 otherwise.
C 1 if a borrow occurs, 0 otherwise.
UF 0
LV 1 if an integer overflow occurs, unchanged otherwise.
LUF Unaffected.

Mode Bit OVM Operation affected by OVM.

Example SUBB *AR5++(4),R5


Before Instruction: 


AR5 = 809800h

R5 = 0FAh = 250

Data at 809800h = 0C7h = 199

LUF LV UF N Z V C = 0 0 0 0 0 0 1


After Instruction: 


AR5 = 809804h 

R5 = 032h = 50 

Data at 809800h = 0C7h = 199

LUF LV UF N Z V C = 0 0 0 0 0 0 0
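Subtract-with-borrow exists mainly so that a subtraction can be carried across several words. The sketch below models the dst - src - C arithmetic on 32-bit words and chains two steps into a 64-bit subtraction. `subb` and `sub64` are hypothetical names, and the V flag and OVM saturation are not modeled.

```python
MASK32 = 0xFFFFFFFF


def subb(dst, src, carry):
    """Return (result, borrow) for dst - src - C on unsigned 32-bit words."""
    diff = (dst & MASK32) - (src & MASK32) - (carry & 1)
    borrow = 1 if diff < 0 else 0   # a borrow out of bit 31 sets C
    return diff & MASK32, borrow


def sub64(a, b):
    """Subtract two 64-bit values using two 32-bit steps linked by the borrow."""
    low, borrow = subb(a & MASK32, b & MASK32, 0)
    high, _ = subb(a >> 32, b >> 32, borrow)
    return (high << 32) | low


# Values from the SUBB example: 250 - 199 - 1 = 50, with no borrow.
example = subb(0xFA, 0xC7, 1)
```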


Subtract Integer with Borrow, 3-Operand SUBB3 


Syntax SUBB3 <src2>,<src1>,<dst>
Operation src1 - src2 - C → dst
Operands src1 three-operand addressing modes (T):
00 register (Rn1, 0 ≤ n1 ≤ 27)
01 indirect (disp = 0, 1, IR0, IR1)
10 register (Rn1, 0 ≤ n1 ≤ 27)
11 indirect (disp = 0, 1, IR0, IR1)

src2 three-operand addressing modes (T):
00 register (Rn2, 0 ≤ n2 ≤ 27)
01 register (Rn2, 0 ≤ n2 ≤ 27)
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)

dst register (Rn, 0 ≤ n ≤ 27)
Encoding
31 24 23 16 15 8 7 0

Description The difference of the src1 and src2 operands and the C (carry) flag is
loaded into the dst register. The src1, src2, and dst operands are assumed
to be signed integers.

Cycles 1
Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if an integer overflow occurs, 0 otherwise.
C 1 if a borrow occurs, 0 otherwise.
UF 0
LV 1 if an integer overflow occurs, unchanged otherwise.
LUF Unaffected.

Mode Bit OVM Operation affected by OVM.

Example SUBB3 R5,*AR5++(IR0),R0


Before Instruction: 


AR5 = 809800h 

IR0 = 4h

R5 = 0C7h = 199

R0 = 0h

Data at 809800h = 0FAh = 250

LUF LV UF N Z V C = 0 0 0 0 0 0 1


After Instruction: 
AR5 = 809804h 


IR0 = 4h
R5 = 0C7h = 199
R0 = 32h = 50

Data at 809800h = 0FAh = 250
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Subtract Integer Conditionally SUBC

Syntax SUBC <src>,<dst>
Operation If (dst - src ≥ 0):
(dst - src << 1) OR 1 → dst
Else:
dst << 1 → dst
Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 27)
01 direct
10 indirect
11 immediate
dst register (Rn, 0 ≤ n ≤ 27)
Encoding
31 24 23 16 15 8 7 0

Description A subtraction of the src operand from the dst operand is performed. The
dst operand is loaded with a value dependent upon the result of the
subtraction. If (dst - src) is greater than or equal to zero, then (dst - src) is
left-shifted one bit, the least-significant bit is set to 1, and the result is
loaded into the dst register. If (dst - src) is less than zero, dst is left-shifted
one bit and loaded into the dst register. The dst and src operands are
assumed to be unsigned integers.

SUBC may be used to perform a single step of a multi-bit integer division.
See Section 12.3.3 for a detailed description.

Cycles 1
Status Bits N Unaffected.
Z Unaffected.
V Unaffected.
C Unaffected.
UF Unaffected.
LV Unaffected.
LUF Unaffected.

Mode Bit OVM Operation not affected by OVM.

Example SUBC @98C5h,R1


Before Instruction: 


DP = 80h 

R1 = 04F6h = 1270

Data at 8098C5h = 492h = 1170 

LUF LV UF NZVC=0000000 


After Instruction: 


DP = 80h 

R1 = 0C9h = 201

Data at 8098C5h = 492h = 1170 

LUF LV UF NZVC=0000000 


Example SUBC 3000,R0 (3000 = 0BB8h)


Before Instruction: 

R0 = 07D0h = 2000

LUF LV UF NZVC=0 000000 
After Instruction: 


R0 = 0FA0h = 4000
LUF LV UF NZVC=0000000 
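To show how a single SUBC step builds up a division, the sketch below models one step and then repeats it with the divisor aligned to the dividend. It is an illustration under stated assumptions, not necessarily the routine given in Section 12.3.3. The names `subc` and `udiv` are hypothetical, and the dividend is restricted to values below 2**31 so that intermediate results stay within 32 bits.

```python
MASK32 = 0xFFFFFFFF


def subc(dst, src):
    """One SUBC step on unsigned 32-bit operands; returns the new dst."""
    diff = dst - src
    if diff >= 0:
        return ((diff << 1) | 1) & MASK32   # subtract, shift, record a 1 bit
    return (dst << 1) & MASK32              # shift only, record a 0 bit


def udiv(dividend, divisor):
    """Unsigned division built from repeated SUBC steps; returns (quotient, remainder)."""
    if divisor <= 0 or not 0 <= dividend < 2**31:
        raise ValueError("sketch handles 0 <= dividend < 2**31 and divisor > 0")
    if dividend < divisor:
        return 0, dividend
    # Align the divisor's most significant bit with the dividend's.
    shift = dividend.bit_length() - divisor.bit_length()
    aligned = divisor << shift
    dst = dividend
    for _ in range(shift + 1):
        dst = subc(dst, aligned)
    # Quotient bits accumulate at the bottom; the remainder sits above them.
    quotient = dst & ((1 << (shift + 1)) - 1)
    remainder = dst >> (shift + 1)
    return quotient, remainder
```

The two single-step examples above correspond to `subc(0x4F6, 0x492)` giving 0C9h and `subc(2000, 3000)` giving 4000.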
Subtract Floating-Point SUBF

Syntax SUBF <src>,<dst>

Operation dst - src → dst

Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate
dst register (Rn, 0 ≤ n ≤ 7)
Encoding
31 24 23 16 15 8 7 0

Description The dst operand minus the src operand is loaded into the dst register. The
dst and src operands are assumed to be floating-point numbers.

Cycles 1

Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if a floating-point overflow occurs, 0 otherwise.
C Unaffected.
UF 1 if a floating-point underflow occurs, 0 otherwise.
LV 1 if a floating-point overflow occurs, unchanged otherwise.
LUF 1 if a floating-point underflow occurs, unchanged otherwise.

Mode Bit OVM Operation not affected by OVM.

Example SUBF *AR0--(IR0),R5


Before Instruction: 


AR0 = 809888h

IR0 = 80h

R5 = 0733C00000h = 1.79750000e+02
Data at 809888h = 070C8000h = 1.4050e+02
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction: 


AR0 = 809808h

IR0 = 80h

R5 = 051D000000h = 3.9250e+01

Data at 809888h = 070C8000h = 1.4050e+02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
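The hexadecimal register values in the floating-point examples can be checked with a small decoder. The sketch assumes the 40-bit extended-precision layout described in the number-format material of this guide: an 8-bit two's-complement exponent, a sign bit, and a 31-bit fraction with an implied leading bit, with exponent -128 reserved for zero. The function name `c30_to_float` is hypothetical.

```python
def c30_to_float(reg40):
    """Decode a 40-bit extended-precision register value into a Python float."""
    exponent = (reg40 >> 32) & 0xFF
    if exponent >= 0x80:
        exponent -= 0x100               # two's-complement exponent
    if exponent == -128:
        return 0.0                      # assumed reserved encoding for zero
    sign = (reg40 >> 31) & 1
    fraction = (reg40 & 0x7FFFFFFF) / 2**31
    # Implied bit: 01.f for positive values, 10.f (that is, -2 + f) for negative.
    mantissa = (fraction - 2.0) if sign else (fraction + 1.0)
    return mantissa * 2.0**exponent


# Operands and result of the SUBF example above.
minuend = c30_to_float(0x0733C00000)
subtrahend = c30_to_float(0x070C800000)
difference = c30_to_float(0x051D000000)
```

Decoding the three values reproduces the arithmetic of the example: 179.75 - 140.5 = 39.25.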
SUBF3 Subtract Floating-Point, 3-Operand

Syntax SUBF3 <src2>,<src1>,<dst>
Operation src1 - src2 → dst
Operands src1 three-operand addressing modes (T):
00 register (Rn1, 0 ≤ n1 ≤ 7)
01 indirect (disp = 0, 1, IR0, IR1)
10 register (Rn1, 0 ≤ n1 ≤ 7)
11 indirect (disp = 0, 1, IR0, IR1)

src2 three-operand addressing modes (T):
00 register (Rn2, 0 ≤ n2 ≤ 7)
01 register (Rn2, 0 ≤ n2 ≤ 7)
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)

dst register (Rn, 0 ≤ n ≤ 7)
Encoding
31 24 23 16 15 8 7 0

Description The difference of the src1 and src2 operands is loaded into the dst register.
The src1, src2, and dst operands are assumed to be floating-point numbers.

Cycles 1
Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if a floating-point overflow occurs, 0 otherwise.
C Unaffected.
UF 1 if a floating-point underflow occurs, 0 otherwise.
LV 1 if a floating-point overflow occurs, unchanged otherwise.
LUF 1 if a floating-point underflow occurs, unchanged otherwise.

Mode Bit OVM Operation not affected by OVM.

Example SUBF3 *AR0--(IR0),*AR1,R4


Before Instruction: 
AR0 = 809888h

IR0 = 80h
AR1 = 809851h
R4 = 0h

Data at 809888h = 070C8000h = 1.4050e+02
Data at 809851h = 0733C000h = 1.79750e+02
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction: 


AR0 = 809808h

IR0 = 80h

AR1 = 809851h

R4 = 051D000000h = 3.9250e+01

Data at 809888h = 070C8000h = 1.4050e+02
Data at 809851h = 0733C000h = 1.79750e+02
LUF LV UF N Z V C = 0 0 0 0 0 0 0


Example SUBF3 R7,R0,R6


Before Instruction:


R7 = 057B400000h = 6.281250e+01

R0 = 034C200000h = 1.27578125e+01

R6 = 0h

LUF LV UF NZVC=0000000 


After Instruction: 


R7 = 057B400000h = 6.281250e+01

R0 = 034C200000h = 1.27578125e+01

R6 = 05B7C80000h = -5.00546875e+01
LUF LV UF N Z V C = 0 0 0 1 0 0 0

SUBF3||STF Parallel SUBF3 and STF

Syntax SUBF3 <src1>,<src2>,<dst1>
|| STF <src3>,<dst2>
Operation src2 - src1 → dst1
|| src3 → dst2
Operands src1 register (Rn1, 0 ≤ n1 ≤ 7)
src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn2, 0 ≤ n2 ≤ 7)
src3 register (Rn3, 0 ≤ n3 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

Description A floating-point subtraction and a floating-point store are performed in
parallel. All registers are read at the beginning and loaded at the end of the
execute cycle. This means that if one of the parallel operations (STF) reads
from a register and the operation being performed in parallel (SUBF3)
writes to the same register, then STF accepts as input the contents of the
register before it is modified by the SUBF3.

If src2 and dst2 point to the same location, src2 is read before the write to
dst2.

Cycles 1
Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if a floating-point overflow occurs, 0 otherwise.
C Unaffected.
UF 1 if a floating-point underflow occurs, 0 otherwise.
LV 1 if a floating-point overflow occurs, unchanged otherwise.
LUF 1 if a floating-point underflow occurs, unchanged otherwise.

Mode Bit OVM Operation not affected by OVM.

Example SUBF3 R1,*-AR4(IR1),R0
|| STF R7,*+AR5(IR0)


Before Instruction: 


R1 = 057B400000h = 6.28125e+01 

AR4 = 8098B8h 

IR1 = 8h 

R0 = 0h

R7 = 0733C00000h = 1.79750e+02

AR5 = 809850h

IR0 = 10h

Data at 8098B0h = 070C8000h = 1.4050e+02
Data at 809860h = 0h

LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction: 


R1 = 057B400000h = 6.28125e+01 

AR4 = 8098B8h 

IR1 = 8h 

R0 = 061B600000h = 7.768750e+01

R7 = 0733C00000h = 1.79750e+02

AR5 = 809850h

IR0 = 10h

Data at 8098B0h = 070C8000h = 1.4050e+02
Data at 809860h = 0733C000h = 1.79750e+02
LUF LV UF N Z V C = 0 0 0 0 0 0 0

SUBI Subtract Integer

Syntax SUBI <src>,<dst>

Operation dst - src → dst

Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 27)
01 direct
10 indirect
11 immediate
dst register (Rn, 0 ≤ n ≤ 27)
Encoding
31 24 23 16 15 8 7 0

Description The dst operand minus the src operand is loaded into the dst register. The
dst and src operands are assumed to be signed integers.

Cycles 1
Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if an integer overflow occurs, 0 otherwise.
C 1 if a borrow occurs, 0 otherwise.
UF 0
LV 1 if an integer overflow occurs, unchanged otherwise.
LUF Unaffected.

Mode Bit OVM Operation affected by OVM.

Example SUBI 220,R7


Before Instruction: 


R7 = 226h = 550 
LUF LV UF NZVC=000000 0 


After Instruction: 


R7 = 14Ah = 330 
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Subtract Integer, 3-Operand SUBI3

Syntax SUBI3 <src2>,<src1>,<dst>
Operation src1 - src2 → dst
Operands src1 three-operand addressing modes (T):
00 register (Rn1, 0 ≤ n1 ≤ 27)
01 indirect (disp = 0, 1, IR0, IR1)
10 register (Rn1, 0 ≤ n1 ≤ 27)
11 indirect (disp = 0, 1, IR0, IR1)

src2 three-operand addressing modes (T):
00 register (Rn2, 0 ≤ n2 ≤ 27)
01 register (Rn2, 0 ≤ n2 ≤ 27)
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)

dst register (Rn, 0 ≤ n ≤ 27)
Encoding
31 24 23 16 15 8 7 0

Description The src1 operand minus the src2 operand is loaded into the dst register. The
src1, src2, and dst operands are assumed to be signed integers.

Cycles 1
Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if an integer overflow occurs, 0 otherwise.
C 1 if a borrow occurs, 0 otherwise.
UF 0
LV 1 if an integer overflow occurs, unchanged otherwise.
LUF Unaffected.

Mode Bit OVM Operation affected by OVM.

Example SUBI3 R7,R2,R0


Before Instruction: 


R2 = 0866h = 2150 

R7 = 0834h = 2100 

R0 = 0h

LUF LV UF NZV C=0000000 


After Instruction: 


R2 = 0866h = 2150 

R7 = 0834h = 2100 

R0 = 032h = 50

LUF LV UF N Z V C = 0 0 0 0 0 0 0
Example SUBI3 *-AR2(1),R4,R3


Before Instruction: 


AR2 = 80985Eh 

R4 = 0226h = 550 

R3 = 0h

Data at 80985Dh = 0DCh = 220

LUF LV UF NZVC=0000000 


After Instruction: 


AR2 = 80985Eh 

R4 = 0226h = 550 

R3 = 014Ah = 330 

Data at 80985Dh = 0DCh = 220

LUF LV UF N Z V C = 0 0 0 0 0 0 0

Parallel SUBI3 and STI SUBI3||STI

Syntax SUBI3 <src1>,<src2>,<dst1>
|| STI <src3>,<dst2>
Operation src2 - src1 → dst1
|| src3 → dst2
Operands src1 register (Rn1, 0 ≤ n1 ≤ 7)
src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn2, 0 ≤ n2 ≤ 7)
src3 register (Rn3, 0 ≤ n3 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)
Encoding
31 24 23 16 15 8 7 0

Description An integer subtraction and an integer store are performed in parallel. All
registers are read at the beginning and loaded at the end of the execute
cycle. This means that if one of the parallel operations (STI) reads from a
register and the operation being performed in parallel (SUBI3) writes to the
same register, then STI accepts as input the contents of the register before
it is modified by the SUBI3.

If src2 and dst2 point to the same location, src2 is read before the write to
dst2.

Cycles 1
Status Bits N 1 if a negative result is generated, 0 otherwise.
Z 1 if a zero result is generated, 0 otherwise.
V 1 if an integer overflow occurs, 0 otherwise.
C 1 if a borrow occurs, 0 otherwise.
UF 0
LV 1 if an integer overflow occurs, unchanged otherwise.
LUF Unaffected.

Mode Bit OVM Operation affected by OVM.

Example SUBI3 R7,*+AR2(IR0),R1
|| STI R3,*++AR7


Before Instruction: 


R7 = 14h = 20 

AR2 = 80982Fh 

IRO = 10h 

R1 = Oh 

R3 = 35h = 53 

AR7 = 80983Bh 

Data at 80983Fh = ODCh = 220 

Data at 80983Ch = Oh 

LUF LV UF NZV C=0000000 


After Instruction: 


R7 = 14h = 20 
AR2 = 80982Fh 
JRO = 10h 


R1 = OC8h = 200 


R3 = 35h = 53 


AR7 = 80983Ch 

Data at 80983Fh = ODCh = 220 

Data at 80983Ch = 35h = 53 

LUF LV UF NZV C=0 000000 
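
The read-before-write rule for parallel instructions can be sketched as a two-phase update. The Python below is a simplified model under stated assumptions: registers and memory are plain dictionaries, the function name is hypothetical, and the effective addresses are computed by the caller rather than by an address unit.

```python
def subi3_sti(regs, mem, src1, src2_addr, dst1, src3, dst2_addr):
    """Model SUBI3 src1,*src2,dst1 || STI src3,*dst2 with read-before-write."""
    # Read phase: every source is sampled before anything is written.
    a = regs[src1]
    b = mem[src2_addr]
    stored = regs[src3]
    # Write phase: both results land at the end of the execute cycle.
    regs[dst1] = (b - a) & 0xFFFFFFFF
    mem[dst2_addr] = stored


# State from the example above.
regs = {"R7": 0x14, "R1": 0x0, "R3": 0x35}
mem = {0x80983F: 0xDC, 0x80983C: 0x0}
subi3_sti(regs, mem,
          "R7", 0x80982F + 0x10,   # *+AR2(IR0): AR2 + IR0, AR2 not modified
          "R1",
          "R3", 0x80983B + 1)      # *++AR7: pre-increment by 1
print(hex(regs["R1"]), hex(mem[0x80983C]))
```

Because the store value is sampled in the read phase, a call in which src3 and dst1 name the same register stores the old register contents, which is the behavior the Description specifies.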


SUBRB    Subtract Reverse Integer with Borrow

Syntax        SUBRB <src>,<dst>

Operation     src - dst - C → dst

Operands      src general addressing modes (G):
                  00  register (Rn, 0 ≤ n ≤ 27)
                  01  direct
                  10  indirect
                  11  immediate

              dst register (Rn, 0 ≤ n ≤ 27)

Encoding      [32-bit field diagram, bits 31-0]

Description   The src operand minus the dst operand minus the carry bit C is
              loaded into the dst register. The dst and src operands are
              assumed to be signed integers.

Cycles        1

Status Bits   N    1 if a negative result is generated, 0 otherwise.
              Z    1 if a zero result is generated, 0 otherwise.
              V    1 if an integer overflow occurs, 0 otherwise.
              C    1 if a borrow occurs, 0 otherwise.
              UF   0
              LV   1 if an integer overflow occurs, unchanged otherwise.
              LUF  Unaffected.

Mode Bit      OVM  Operation affected by OVM.

Example       SUBRB R4,R6

              Before Instruction:

              R4 = 03CBh = 971
              R6 = 0258h = 600
              LUF LV UF N Z V C = 0 0 0 0 0 0 1

              After Instruction:

              R4 = 03CBh = 971
              R6 = 0172h = 370
              LUF LV UF N Z V C = 0 0 0 0 0 0 0
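
A borrow-propagating reverse subtract is the building block for multiword arithmetic. The Python sketch below is a hypothetical model, not device code: `subrb` mirrors the Operation line above on 32-bit words, and `subr64` is an assumed helper showing how a borrow from the low word could feed the high word.

```python
MASK32 = 0xFFFFFFFF


def subrb(src, dst, carry):
    """Return (result, borrow) for src - dst - carry on 32-bit words."""
    exact = (src & MASK32) - (dst & MASK32) - carry
    return exact & MASK32, int(exact < 0)


def subr64(src, dst):
    """64-bit src - dst built from two 32-bit reverse subtracts."""
    lo, borrow = subrb(src & MASK32, dst & MASK32, 0)   # low words, no borrow in
    hi, _ = subrb(src >> 32, dst >> 32, borrow)         # high words, borrow in
    return (hi << 32) | lo


# The example above: 971 - 600 - 1 = 370.
print(subrb(0x3CB, 0x258, 1))
```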
SUBRF    Subtract Reverse Floating-Point

Syntax        SUBRF <src>,<dst>

Operation     src - dst → dst

Operands      src general addressing modes (G):
                  00  register (Rn, 0 ≤ n ≤ 7)
                  01  direct
                  10  indirect
                  11  immediate

              dst register (Rn, 0 ≤ n ≤ 7)

Encoding      [32-bit field diagram, bits 31-0]

Description   The src operand minus the dst operand is loaded into the dst
              register. The dst and src operands are assumed to be
              floating-point numbers.

Cycles        1

Status Bits   N    1 if a negative result is generated, 0 otherwise.
              Z    1 if a zero result is generated, 0 otherwise.
              V    1 if a floating-point overflow occurs, 0 otherwise.
              C    Unaffected.
              UF   1 if a floating-point underflow occurs, 0 otherwise.
              LV   1 if a floating-point overflow occurs, unchanged otherwise.
              LUF  1 if a floating-point underflow occurs, unchanged otherwise.

Mode Bit      OVM  Operation not affected by OVM.

Example       SUBRF @9905h,R5

              Before Instruction:

              DP = 80h
              R5 = 057B400000h = 6.281250e+01
              Data at 809905h = 733C000h = 1.79750e+02
              LUF LV UF N Z V C = 0 0 0 0 0 0 0

              After Instruction:

              DP = 80h
              R5 = 0669E00000h = 1.16937500e+02
              Data at 809905h = 733C000h = 1.79750e+02
              LUF LV UF N Z V C = 0 0 0 0 0 0 0
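
The hexadecimal values in this example can be decoded by hand. The Python sketch below assumes the TMS320C30 floating-point layout of an 8-bit two's-complement exponent followed by a sign bit and a fraction with an implied leading bit (exponent -128 reserved for zero); the function names are hypothetical and the model ignores rounding and normalization.

```python
def c30_ext_to_float(word40):
    """Decode a 40-bit extended-precision value: exponent, sign, 31-bit fraction."""
    exp = (word40 >> 32) & 0xFF
    if exp >= 0x80:
        exp -= 0x100                 # two's-complement exponent
    if exp == -128:
        return 0.0                   # reserved encoding for zero
    sign = (word40 >> 31) & 1
    frac = (word40 & 0x7FFFFFFF) / 2.0 ** 31
    # Positive mantissa is 01.f, negative mantissa is 10.f (two's complement).
    mant = (-2.0 + frac) if sign else (1.0 + frac)
    return mant * 2.0 ** exp


def c30_single_to_float(word32):
    """Decode a 32-bit value by widening its 24-bit mantissa to 32 bits."""
    exp = (word32 >> 24) & 0xFF
    mant24 = word32 & 0xFFFFFF
    return c30_ext_to_float((exp << 32) | (mant24 << 8))


src = c30_single_to_float(0x0733C000)    # data at 809905h
dst = c30_ext_to_float(0x057B400000)     # R5 before
print(src, dst, src - dst)
```

Under these assumptions the memory operand decodes to 179.75 and R5 to 62.8125, and their difference, 116.9375, matches the decoded value of R5 after the instruction.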
SUBRI    Subtract Reverse Integer

Syntax        SUBRI <src>,<dst>

Operation     src - dst → dst

Operands      src general addressing modes (G):
                  00  register (Rn, 0 ≤ n ≤ 27)
                  01  direct
                  10  indirect
                  11  immediate

              dst register (Rn, 0 ≤ n ≤ 27)

Encoding      [32-bit field diagram, bits 31-0]

Description   The src operand minus the dst operand is loaded into the dst
              register. The dst and src operands are assumed to be signed
              integers.

Cycles        1

Status Bits   N    1 if a negative result is generated, 0 otherwise.
              Z    1 if a zero result is generated, 0 otherwise.
              V    1 if an integer overflow occurs, 0 otherwise.
              C    1 if a borrow occurs, 0 otherwise.
              UF   0
              LV   1 if an integer overflow occurs, unchanged otherwise.
              LUF  Unaffected.

Mode Bit      OVM  Operation affected by OVM.

Example       SUBRI *AR5++(IR0),R3

              Before Instruction:

              AR5 = 809900h
              IR0 = 8h
              R3 = 0DCh = 220
              Data at 809900h = 226h = 550
              LUF LV UF N Z V C = 0 0 0 0 0 0 0

              After Instruction:

              AR5 = 809908h
              IR0 = 8h
              R3 = 014Ah = 330
              Data at 809900h = 226h = 550
              LUF LV UF N Z V C = 0 0 0 0 0 0 0
SWI    Software Interrupt

Syntax        SWI

Operation     Performs an emulation interrupt.

Operands      None

Encoding      31                                   0
              01100110 00000000 00000000 00000000

Description   The SWI instruction performs an emulator interrupt. This is a
              reserved instruction and should not be used in normal
              programming.

Cycles        4

Status Bits   N    Unaffected.
              Z    Unaffected.
              V    Unaffected.
              C    Unaffected.
              UF   Unaffected.
              LV   Unaffected.
              LUF  Unaffected.

Mode Bit      OVM  Operation not affected by OVM.
TRAPcond    Trap Conditionally

Syntax        TRAPcond <N>

Operation     0 → ST(GIE)
              If cond is true:
                  Next PC → *++SP
                  Trap vector N → PC
              Else:
                  Set ST(GIE) to original state
                  Continue.

Operands      N (0 ≤ N ≤ 31)

Encoding      [32-bit field diagram, bits 31-0]

Description   Interrupts are disabled globally when 0 is written to ST(GIE).
              If the condition is true, the contents of the PC are pushed on
              the system stack and the PC is loaded with the contents of the
              specified trap vector (N). If the condition is not true,
              ST(GIE) is set to its value before the TRAPcond instruction
              changed it.

              The TMS320C30 provides 20 condition codes that can be used with
              this instruction (see Section 9.1 for a list of condition
              mnemonics, encoding, and flags).

Cycles        5

Status Bits   N    Unaffected.
              Z    Unaffected.
              V    Unaffected.
              C    Unaffected.
              UF   Unaffected.
              LV   Unaffected.
              LUF  Unaffected.

Mode Bit      OVM  Operation not affected by OVM.

Example       TRAPZ 16

              Before Instruction:

              PC = 123h
              SP = 809870h
              ST = 0h
              Trap Vector 16 = 10h
              LUF LV UF N Z V C = 0 0 0 0 0 0 0

              After Instruction:

              PC = 10h
              SP = 809871h
              Data at 809871h = 124h
              ST = 0h
              LUF LV UF N Z V C = 0 0 0 0 0 0 0
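
The Operation steps above can be traced with a small state model. This Python sketch is illustrative only: the dictionary-based state, the function name, and the `cond` argument (the already-evaluated condition) are assumptions, and "next PC" is taken to be PC + 1 because the instruction occupies one word.

```python
def trap_cond(cond, n, state, mem, vectors):
    """Model TRAPcond N on a state dict with keys PC, SP, and GIE."""
    saved_gie = state["GIE"]
    state["GIE"] = 0                          # 0 -> ST(GIE)
    if cond:
        state["SP"] += 1                      # *++SP: pre-increment the stack pointer
        mem[state["SP"]] = state["PC"] + 1    # push the address of the next instruction
        state["PC"] = vectors[n]              # trap vector N -> PC
    else:
        state["GIE"] = saved_gie              # restore ST(GIE) and continue
        state["PC"] += 1


# State from the TRAPZ 16 example, with the condition taken as true.
state = {"PC": 0x123, "SP": 0x809870, "GIE": 0}
mem = {}
trap_cond(True, 16, state, mem, {16: 0x10})
print(hex(state["PC"]), hex(state["SP"]), hex(mem[0x809871]))
```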
TSTB    Test Bit Fields

Syntax        TSTB <src>,<dst>

Operation     dst AND src

Operands      src general addressing modes (G):
                  00  register (Rn, 0 ≤ n ≤ 27)
                  01  direct
                  10  indirect
                  11  immediate

              dst register (Rn, 0 ≤ n ≤ 27)

Encoding      [32-bit field diagram, bits 31-0]

Description   The bitwise logical-AND of the dst and src operands is formed,
              but the result is not loaded in any register. This allows for
              nondestructive compares. The dst and src operands are assumed
              to be unsigned integers.

Cycles        1

Status Bits   N    MSB of the output.
              Z    1 if a zero output is generated, 0 otherwise.
              V    0
              C    Unaffected.
              UF   0
              LV   Unaffected.
              LUF  Unaffected.

Mode Bit      OVM  Operation not affected by OVM.

Example       TSTB *-AR4(1),R5

              Before Instruction:

              AR4 = 8099C5h
              R5 = 898h = 2200
              Data at 8099C4h = 767h = 1895
              LUF LV UF N Z V C = 0 0 0 0 0 0 0

              After Instruction:

              AR4 = 8099C5h
              R5 = 898h = 2200
              Data at 8099C4h = 767h = 1895
              LUF LV UF N Z V C = 0 0 0 0 1 0 0
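
A nondestructive compare only reports flags. The Python sketch below is a hypothetical model of the N and Z results described above; the function names are not part of the instruction set, and operands are treated as 32-bit unsigned words.

```python
def tstb(src, dst):
    """Return the N and Z bits for dst AND src; neither operand is modified."""
    out = (dst & src) & 0xFFFFFFFF
    return {"N": out >> 31, "Z": int(out == 0)}


def bit_is_set(word, k):
    """Test bit k of word using the Z result of a TSTB-style compare."""
    return tstb(1 << k, word)["Z"] == 0


# The example above: 898h AND 767h share no set bits, so Z = 1.
print(tstb(0x767, 0x898))
```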
TSTB3    Test Bit Fields, 3-Operand

Syntax        TSTB3 <src2>,<src1>

Operation     src1 AND src2

Operands      src1 three-operand addressing modes (T):
                  00  register (Rn1, 0 ≤ n1 ≤ 27)
                  01  indirect (disp = 0, 1, IR0, IR1)
                  10  register (Rn1, 0 ≤ n1 ≤ 27)
                  11  indirect (disp = 0, 1, IR0, IR1)

              src2 three-operand addressing modes (T):
                  00  register (Rn2, 0 ≤ n2 ≤ 27)
                  01  register (Rn2, 0 ≤ n2 ≤ 27)
                  10  indirect (disp = 0, 1, IR0, IR1)
                  11  indirect (disp = 0, 1, IR0, IR1)

Encoding      [32-bit field diagram, bits 31-0]

Description   The bitwise logical-AND between the src1 and src2 operands is
              formed, but is not loaded into any register. This allows for
              nondestructive compares. The src1 and src2 operands are assumed
              to be unsigned integers. Although this instruction has only two
              operands, it is designated as a three-operand instruction since
              operands are specified in the three-operand format.

Cycles        1

Status Bits   N    MSB of the output.
              Z    1 if a zero output is generated, 0 otherwise.
              V    0
              C    Unaffected.
              UF   0
              LV   Unaffected.
              LUF  Unaffected.

Mode Bit      OVM  Operation not affected by OVM.

Example       TSTB3 *AR5--(IR0),*+AR0(1)

              Before Instruction:

              AR5 = 809885h
              IR0 = 80h
              AR0 = 80992Ch
              Data at 809885h = 898h = 2200
              Data at 80992Dh = 767h = 1895
              LUF LV UF N Z V C = 0 0 0 0 0 0 0

              After Instruction:

              AR5 = 809805h
              IR0 = 80h
              AR0 = 80992Ch
              Data at 809885h = 898h = 2200
              Data at 80992Dh = 767h = 1895
              LUF LV UF N Z V C = 0 0 0 0 1 0 0

Example       TSTB3 R4,*AR6--(IR0)

              Before Instruction:

              R4 = 0FBC4h
              AR6 = 8099F8h
              IR0 = 8h
              Data at 8099F8h = 1568h
              LUF LV UF N Z V C = 0 0 0 0 0 0 0

              After Instruction:

              R4 = 0FBC4h
              AR6 = 8099F0h
              IR0 = 8h
              Data at 8099F8h = 1568h
              LUF LV UF N Z V C = 0 0 0 0 0 0 0
XOR    Bitwise Exclusive-OR

Syntax        XOR <src>,<dst>

Operation     dst XOR src → dst

Operands      src general addressing modes (G):
                  00  register (Rn, 0 ≤ n ≤ 27)
                  01  direct
                  10  indirect
                  11  immediate

              dst register (Rn, 0 ≤ n ≤ 27)

Encoding      [32-bit field diagram, bits 31-0]

Description   The bitwise exclusive-OR of the src and dst operands is loaded
              into the dst register. The dst and src operands are assumed to
              be unsigned integers.

Cycles        1

Status Bits   N    MSB of the output.
              Z    1 if a zero output is generated, 0 otherwise.
              V    0
              C    Unaffected.
              UF   0
              LV   Unaffected.
              LUF  Unaffected.

Mode Bit      OVM  Operation not affected by OVM.

Example       XOR R1,R2

              Before Instruction:

              R1 = 0FFA32h
              R2 = 0FF5C1h
              LUF LV UF N Z V C = 0 0 0 0 0 0 0

              After Instruction:

              R1 = 0FFA32h
              R2 = 000FF3h
              LUF LV UF N Z V C = 0 0 0 0 0 0 0
XOR3    Bitwise Exclusive-OR, 3-Operand

Syntax        XOR3 <src2>,<src1>,<dst>

Operation     src1 XOR src2 → dst

Operands      src1 three-operand addressing modes (T):
                  00  register (Rn1, 0 ≤ n1 ≤ 27)
                  01  indirect (disp = 0, 1, IR0, IR1)
                  10  register (Rn1, 0 ≤ n1 ≤ 27)
                  11  indirect (disp = 0, 1, IR0, IR1)

              src2 three-operand addressing modes (T):
                  00  register (Rn2, 0 ≤ n2 ≤ 27)
                  01  register (Rn2, 0 ≤ n2 ≤ 27)
                  10  indirect (disp = 0, 1, IR0, IR1)
                  11  indirect (disp = 0, 1, IR0, IR1)

              dst register (Rn, 0 ≤ n ≤ 27)

Encoding      [32-bit field diagram, bits 31-0]

Description   The bitwise exclusive-OR between the src1 and src2 operands is
              loaded into the dst register. The src1, src2, and dst operands
              are assumed to be unsigned integers.

Cycles        1

Status Bits   N    MSB of the output.
              Z    1 if a zero output is generated, 0 otherwise.
              V    0
              C    Unaffected.
              UF   0
              LV   Unaffected.
              LUF  Unaffected.

Mode Bit      OVM  Operation not affected by OVM.

Example       XOR3 *AR3++(IR0),R7,R4

              Before Instruction:

              AR3 = 809800h
              IR0 = 10h
              R7 = 0FFFFh
              R4 = 0h
              Data at 809800h = 5AC3h
              LUF LV UF N Z V C = 0 0 0 0 0 0 0

              After Instruction:

              AR3 = 809810h
              IR0 = 10h
              R7 = 0FFFFh
              R4 = 0A53Ch
              Data at 809800h = 5AC3h
              LUF LV UF N Z V C = 0 0 0 0 0 0 0

Example       XOR3 R5,*-AR1(1),R1

              Before Instruction:

              R5 = 0FFA32h
              AR1 = 809826h
              R1 = 0h
              Data at 809825h = 0FF5C1h
              LUF LV UF N Z V C = 0 0 0 0 0 0 0

              After Instruction:

              R5 = 0FFA32h
              AR1 = 809826h
              R1 = 000FF3h
              Data at 809825h = 0FF5C1h
              LUF LV UF N Z V C = 0 0 0 0 0 0 0
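
The first example combines an exclusive-OR with post-modified indirect addressing. The Python sketch below models that combination under stated assumptions: registers and memory are dictionaries, the function name is hypothetical, and the auxiliary-register update is assumed to wrap at 24 bits.

```python
def xor3_postinc(regs, mem, ar, ir, src1, dst):
    """Model XOR3 *ARn++(IRm),src1,dst: use ARn as the address, then add IRm."""
    addr = regs[ar]                        # the unmodified ARn addresses the operand
    value = mem[addr]
    regs[dst] = (regs[src1] ^ value) & 0xFFFFFFFF
    regs[ar] = (addr + regs[ir]) & 0xFFFFFF   # post-add; 24-bit wrap is an assumption


# State from the first example above.
regs = {"AR3": 0x809800, "IR0": 0x10, "R7": 0xFFFF, "R4": 0x0}
mem = {0x809800: 0x5AC3}
xor3_postinc(regs, mem, "AR3", "IR0", "R7", "R4")
print(hex(regs["R4"]), hex(regs["AR3"]))
```

The same arithmetic confirms the second example: 0FFA32h XOR 0FF5C1h is 000FF3h.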
XOR3||STI    Parallel XOR3 and STI

Syntax        XOR3 <src2>,<src1>,<dst1>
           || STI  <src3>,<dst2>

Operation     src1 XOR src2 → dst1
           || src3 → dst2

Operands      src1  register (Rn1, 0 ≤ n1 ≤ 7)
              src2  indirect (disp = 0, 1, IR0, IR1)
              dst1  register (Rn2, 0 ≤ n2 ≤ 7)
              src3  register (Rn3, 0 ≤ n3 ≤ 7)
              dst2  indirect (disp = 0, 1, IR0, IR1)

Encoding      [32-bit field diagram, bits 31-0]

Description   A bitwise exclusive-OR and an integer store are performed in
              parallel. All registers are read at the beginning and loaded at
              the end of the execute cycle. This means that if one of the
              parallel operations (STI) reads from a register and the
              operation being performed in parallel (XOR3) writes to the same
              register, then STI accepts as input the contents of the
              register before it is modified by the XOR3.

              If src2 and dst2 point to the same location, src2 is read
              before the write to dst2.

Cycles        1

Status Bits   N    MSB of the output.
              Z    1 if a zero output is generated, 0 otherwise.
              V    0
              C    Unaffected.
              UF   0
              LV   Unaffected.
              LUF  Unaffected.

Mode Bit      OVM  Operation not affected by OVM.

Example       XOR3 *AR1++,R3,R3
           || STI  R6,*-AR2(IR0)

              Before Instruction:

              AR1 = 80987Eh
              R3 = 85h
              R6 = 0DCh = 220
              AR2 = 8098B4h
              IR0 = 8h
              Data at 80987Eh = 85h
              Data at 8098ACh = 0h
              LUF LV UF N Z V C = 0 0 0 0 0 0 0

              After Instruction:

              AR1 = 80987Fh
              R3 = 0h
              R6 = 0DCh = 220
              AR2 = 8098B4h
              IR0 = 8h
              Data at 80987Eh = 85h
              Data at 8098ACh = 0DCh = 220
              LUF LV UF N Z V C = 0 0 0 0 1 0 0
Section 12


Software Applications


The TMS320C30 is a powerful digital signal processor whose architecture and
instruction set are designed to simplify system solutions to DSP problems.
It provides instructions designed specifically for efficient implementation
of DSP algorithms, as well as general-purpose instructions that make the
device suitable for more general tasks, like any microprocessor. The
floating-point and integer arithmetic supported by the device lets the
designer concentrate on the algorithm with minimal concern about scaling,
dynamic range, and overflow.


This section explains how to use the instruction set, the architecture, and
the interface of the TMS320C30 processor. It presents coding examples for
frequently used applications and discusses more involved examples and
applications. In each case, the principles involved in the application are
explained and the corresponding assembly-language code is given, both for
instructional purposes and for immediate use. Where a detailed explanation
of the underlying theory is too extensive to include in this manual,
references are given for further information.


Major topics discussed in this section are listed below.


•  Processor Initialization (Section 12.1)

•  Program Control (Section 12.2)
   -  Subroutine calls
   -  Software stack
   -  Interrupt handling
   -  Delayed branches
   -  Repeat modes
   -  Computed GOTOs

•  Logical and Arithmetic Operations (Section 12.3)
   -  Bit manipulation
   -  Block moves
   -  Bit-reversed addressing
   -  Division
   -  Square root
   -  Extended-precision arithmetic
   -  IEEE <==> C30 floating-point conversions

•  Application-oriented Operations (Section 12.4)
   -  Companding (A-law, μ-law)
   -  FIR / IIR filters (fixed and adaptive)
   -  Matrix math
   -  FFT
   -  Lattice filters

•  Programming Tips (Section 12.5)
   -  C-callable routines
   -  Code optimization checklist
12.1 Processor Initialization


The processor must be initialized before a digital signal processing
algorithm is executed. Generally, initialization takes place any time the
processor is reset.


When reset is activated by applying a low level to the RESET input for
several cycles, the TMS320C30 terminates execution and puts the reset vector
(i.e., the contents of memory location 0) in the program counter. The reset
vector normally contains the address of the system initialization routine.
The hardware reset also initializes various registers and status bits.


After reset, the processor should be initialized to meet the requirements of
the system. Instructions should be executed that set up operational modes,
memory pointers, interrupts, and the remaining functions needed to meet
system requirements.


To configure the processor at reset, the following internal functions should
be initialized:


•  Memory-mapped registers

•  Interrupt structure


Example 12-1 shows coding for initializing the TMS320C30 to the following
machine state, in addition to the initialization performed during the
hardware reset (for conditions after hardware reset, see Section 13):


•  All interrupts are enabled.

•  The overflow mode is disabled.

•  The data memory page pointer is set to zero.

•  The internal memory is filled with zeros.


Note that all constants larger than 16 bits should be placed in memory and
accessed through direct or indirect addressing.


Example 12-1. TMS320C30 Processor Initialization


*       TITL 'PROCESSOR INITIALIZATION EXAMPLE'
*
        .global RESET,INIT,BEGIN
        .global INT0,INT1,INT2,INT3
        .global ISR0,ISR1,ISR2,ISR3
        .global DINT,DMA
        .global TINT0,TINT1,XINT0,RINT0,XINT1,RINT1
        .global TIME0,TIME1,XMT0,RCV0,XMT1,RCV1
        .global TRAP0,TRAP1,TRAP2,TRP0,TRP1,TRP2
*
*       PROCESSOR INITIALIZATION FOR THE TMS320C30.
*
*       RESET AND INTERRUPT VECTOR SPECIFICATION. THIS
*       ARRANGEMENT ASSUMES THAT DURING LINKING, THE FOLLOWING
*       TEXT SEGMENT WILL BE PLACED TO START AT MEMORY
*       LOCATION 0.
*
        .sect   "init"          ; Named section
RESET   .word   INIT            ; RS- loads address INIT to PC
*
INT0    .word   ISR0            ; INT0- loads address ISR0 to PC
INT1    .word   ISR1            ; INT1- loads address ISR1 to PC
INT2    .word   ISR2            ; INT2- loads address ISR2 to PC
INT3    .word   ISR3            ; INT3- loads address ISR3 to PC
*
XINT0   .word   XMT0            ; Serial port 0 transmit processing
RINT0   .word   RCV0            ; Serial port 0 receive processing
XINT1   .word   XMT1            ; Serial port 1 transmit processing
RINT1   .word   RCV1            ; Serial port 1 receive processing
TINT0   .word   TIME0           ; Timer 0 interrupt processing
TINT1   .word   TIME1           ; Timer 1 interrupt processing
DINT    .word   DMA             ; DMA interrupt
        .space  20              ; Reserved space
TRAP0   .word   TRP0            ; Trap 0 vector processing begins
TRAP1   .word   TRP1            ; Trap 1 vector processing begins
TRAP2   .word   TRP2            ; Trap 2 vector processing begins
        .space  29              ; Leave space for the other 29 traps
*
*       IN THIS SECTION, CONSTANTS THAT CANNOT BE REPRESENTED
*       IN THE SHORT FORMAT ARE INITIALIZED.
*
        .data
MASK     .word  0FFFFFFFFH
BLK0     .word  0809800H        ; Beginning address of RAM block 0
BLK1     .word  0809C00H        ; Beginning address of RAM block 1
STCK     .word  0809F00H        ; Beginning of stack
CTRL     .word  0808000H        ; Pointer for peripheral-bus memory map
DMACTL   .word  0000000H        ; Initialization for DMA control (0)
TIM0CTL  .word  0000000H        ; Initialization of timer 0 control (32)
TIM1CTL  .word  0000000H        ; Initialization of timer 1 control (48)
SERGLOB0 .word  0000000H        ; Init serial 0 glbl control (64)
SERPRTX0 .word  0000000H        ; Init serial 0 xmt port control (66)
SERPRTR0 .word  0000000H        ; Init serial 0 rcv port control (67)
SERTIM0  .word  0000000H        ; Init serial 0 timer control (68)
SERGLOB1 .word  0000000H        ; Init serial 1 glbl control (80)
SERPRTX1 .word  0000000H        ; Init serial 1 xmt port control (82)
SERPRTR1 .word  0000000H        ; Init serial 1 rcv port control (83)
SERTIM1  .word  0000000H        ; Init serial 1 timer control (84)
PARINT   .word  0000000H        ; Init parallel interface control (96)
IOINT    .word  0000000H        ; Init I/O interface control (100)
*
        .text
*
*       THE ADDRESS AT MEMORY LOCATION 0 DIRECTS EXECUTION TO BEGIN HERE
*       FOR RESET PROCESSING THAT INITIALIZES THE PROCESSOR. WHEN RESET
*       IS APPLIED, THE FOLLOWING REGISTERS ARE INITIALIZED TO ZERO:
*
*           ST  - CPU STATUS REGISTER
*           IE  - CPU/DMA INTERRUPT ENABLE FLAGS
*           IF  - CPU INTERRUPT FLAGS
*           IOF - I/O FLAGS
*
*       THE STATUS REGISTER HAS THE FOLLOWING ARRANGEMENT:
*
*       BITS:     31-14  13   12  11  10  9    8   7    6    5   4   3  2  1  0
*       FUNCTION: RESRV  GIE  CC  CE  CF  RES  RM  OVM  LUF  LV  UF  N  Z  V  C
*
INIT    LDP     0,DP            ; Point the DP register to page 0
        LDI     1800H,ST        ; Clear and enable cache, and disable OVM
        LDI     @MASK,IE        ; Unmask all interrupts
*
*       INTERNAL DATA MEMORY INITIALIZATION TO FLOATING POINT ZERO
*
        LDI     @BLK0,AR0       ; AR0 points to block 0
        LDI     @BLK1,AR1       ; AR1 points to block 1
        LDF     0.0,R0          ; Zero register R0
        RPTS    1023            ; Repeat 1024 times
        STF     R0,*AR0++(1)    ; Zero out location in RAM block 0 and
||      STF     R0,*AR1++(1)    ; zero out location in RAM block 1.
*
*       THE PROCESSOR IS INITIALIZED. THE REMAINING APPLICATION-
*       DEPENDENT PART OF THE SYSTEM (BOTH ON- AND OFF-CHIP) SHOULD
*       NOW BE INITIALIZED.
*
*       FIRST, INITIALIZE THE CONTROL REGISTERS. IN THIS EXAMPLE,
*       EVERYTHING IS INITIALIZED TO ZERO SINCE THE ACTUAL INITIAL-
*       IZATION IS APPLICATION DEPENDENT.
*
        LDI     @CTRL,AR0       ; Load in AR0 the pointer to control
*                               ; registers
        LDI     @DMACTL,R0
        STI     R0,*+AR0(0)     ; Init DMA control
        LDI     @TIM0CTL,R0
        STI     R0,*+AR0(32)    ; Init timer 0 control
        LDI     @TIM1CTL,R0
        STI     R0,*+AR0(48)    ; Init timer 1 control
        LDI     @SERGLOB0,R0
        STI     R0,*+AR0(64)    ; Init serial 0 global control
        LDI     @SERPRTX0,R0
        STI     R0,*+AR0(66)    ; Init serial 0 xmt control
        LDI     @SERPRTR0,R0
        STI     R0,*+AR0(67)    ; Init serial 0 rcv control
        LDI     @SERTIM0,R0
        STI     R0,*+AR0(68)    ; Init serial 0 timer control
        LDI     @SERGLOB1,R0
        STI     R0,*+AR0(80)    ; Init serial 1 global control
        LDI     @SERPRTX1,R0
        STI     R0,*+AR0(82)    ; Init serial 1 xmt control
        LDI     @SERPRTR1,R0
        STI     R0,*+AR0(83)    ; Init serial 1 rcv control
        LDI     @SERTIM1,R0
        STI     R0,*+AR0(84)    ; Init serial 1 timer control
        LDI     @PARINT,R0
        STI     R0,*+AR0(96)    ; Init parallel interface control
        LDI     @IOINT,R0
        STI     R0,*+AR0(100)   ; Init I/O interface control
*
        LDI     @STCK,SP        ; Initialize the stack pointer
        OR      2000H,ST        ; Global interrupt enable
*
        BR      BEGIN           ; Branch to the beginning of application.
        .end
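
The two status-register constants used in Example 12-1 follow directly from the bit arrangement given in the listing's comments. The Python sketch below is an illustrative check only; the table and function names are hypothetical, and the bit positions are those of the BITS/FUNCTION comment (bit 9 and bits 31-14 reserved).

```python
# Bit positions of the status register fields, per the listing's comment block.
ST_BITS = {"C": 0, "V": 1, "Z": 2, "N": 3, "UF": 4, "LV": 5, "LUF": 6,
           "OVM": 7, "RM": 8, "CF": 10, "CE": 11, "CC": 12, "GIE": 13}


def st_mask(*names):
    """Build a status-register mask from field names."""
    mask = 0
    for name in names:
        mask |= 1 << ST_BITS[name]
    return mask


def st_fields(value):
    """List the named fields that are set in a status-register value."""
    return sorted(n for n, b in ST_BITS.items() if (value >> b) & 1)


print(hex(st_mask("CC", "CE")))   # constant loaded by LDI 1800H,ST
print(hex(st_mask("GIE")))        # constant used by OR 2000H,ST
print(st_fields(0x1800 | 0x2000))
```

Loading 1800H sets the cache-clear and cache-enable bits while leaving OVM at 0, and OR-ing in 2000H afterward sets only GIE.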
12.2 Program Control


To facilitate the TMS320C30's use in general-purpose, high-speed processing,
a variety of instructions is provided to handle:


•  Subroutine calls

•  The software stack

•  Interrupts

•  Zero-overhead branches

•  Single- and multiple-instruction loops without any overhead


This section describes how to use these features of the TMS320C30.


12.2.1 Subroutines


The TMS320C30 has a 24-bit program counter (PC) and a practically unlimited
software stack. The CALL and CALLcond subroutine calls increment the stack
pointer and store the address of the next instruction (the next value of the
PC) on the stack. At the end of the subroutine, RETScond performs a
conditional return.


Example 12-2 illustrates the use of a subroutine to determine the dot product
of two vectors. Given two vectors of length N, represented by the arrays
a[0], a[1],..., a[N-1] and b[0], b[1],..., b[N-1], the dot product is
computed from the expression


d = a[0] b[0] + a[1] b[1] + ... + a[N-1] b[N-1]


Processing proceeds in the main routine to the point where the dot product
is to be computed. It is assumed that the arguments of the subroutine have
been appropriately initialized. At this point, a CALL is made to the
subroutine, transferring control to that section of the program memory for
execution. When execution has completed, the RETS instruction returns control
to the calling routine. For this particular example, it would suffice to
save register R2; a larger number of registers is saved for demonstration
purposes. The saved registers are stored on the system stack, which should be
large enough to accommodate the maximum anticipated storage requirements.
Any other method of saving registers could be used equally well.
Example 12-2. Subroutine Call (Dot Product)


*       TITL SUBROUTINE CALL (DOT PRODUCT)
*
*       MAIN ROUTINE THAT CALLS THE SUBROUTINE 'DOT' TO COMPUTE THE
*       DOT PRODUCT OF TWO VECTORS.
*
        LDI     @blk0,AR0       ; AR0 points to vector a
        LDI     @blk1,AR1       ; AR1 points to vector b
        LDI     N,RC            ; RC contains the number of elements
        CALL    DOT
*
*       SUBROUTINE DOT
*
*       EQUATION: d = a(0) * b(0) + a(1) * b(1) + ... + a(N-1) * b(N-1)
*
*       THE DOT PRODUCT OF a AND b IS PLACED IN REGISTER R0. N MUST
*       BE GREATER THAN OR EQUAL TO 2.
*
*       ARGUMENT ASSIGNMENTS:
*       ARGUMENT | FUNCTION
*       ---------+----------------------
*       AR0      | ADDRESS OF a(0)
*       AR1      | ADDRESS OF b(0)
*       RC       | LENGTH OF VECTORS (N)
*
*       REGISTERS USED AS INPUT: AR0, AR1, RC
*       REGISTER MODIFIED: R0
*       REGISTER CONTAINING RESULT: R0
*
        .global DOT
*
DOT     PUSH    R2              ; Use the stack to save R2's
        PUSHF   R2              ; bottom 32 and top 32 bits
        PUSH    ST              ; Save status register
        PUSH    AR0             ; Save AR0
        PUSH    AR1             ; Save AR1
        PUSH    RC              ; Save RC
*
*                               ; Initialize R0:
        MPYF3   *AR0,*AR1,R0    ; a(0) * b(0) -> R0
        LDF     0.0,R2          ; Initialize R2.
        SUBI    2,RC            ; Set RC = N-2
*
*       DOT PRODUCT (1 <= i < N)
*
        RPTS    RC                      ; Setup the repeat single.
        MPYF3   *++AR0(1),*++AR1(1),R0  ; a(i) * b(i) -> R0
||      ADDF3   R0,R2,R2                ; a(i-1)*b(i-1) + R2 -> R2
*
        ADDF3   R0,R2,R0                ; a(N-1)*b(N-1) + R2 -> R0
*
*       RETURN SEQUENCE
*
        POP     RC              ; Restore RC
        POP     AR1             ; Restore AR1
        POP     AR0             ; Restore AR0
        POP     ST              ; Restore ST
        POPF    R2              ; Restore top 32 bits of R2
        POP     R2              ; Restore bottom 32 bits of R2
        RETS                    ; Return
*
*       end
*
        .end
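
The repeated MPYF3 || ADDF3 pair in Example 12-2 forms a software pipeline: each pass multiplies the current pair of elements while adding the product from the previous pass. The Python sketch below mirrors that structure for illustration; the function name is hypothetical, and ordinary Python floats stand in for the extended-precision registers.

```python
def dot_pipelined(a, b):
    """Mimic Example 12-2: multiply a(i)*b(i) while adding the previous product."""
    n = len(a)
    if n < 2 or n != len(b):
        raise ValueError("the routine assumes equal-length vectors with N >= 2")
    r0 = a[0] * b[0]            # MPYF3 *AR0,*AR1,R0
    r2 = 0.0                    # LDF 0.0,R2
    for i in range(1, n):       # RPTS RC with RC = N-2 executes N-1 times
        # Parallel pair: both operations read the old R0 and R2.
        r0, r2 = a[i] * b[i], r0 + r2
    return r0 + r2              # final ADDF3 R0,R2,R0


print(dot_pipelined([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```

The tuple assignment plays the role of the parallel instruction: the right-hand side is evaluated with the old register values before either register is updated, which is why the last product still has to be added after the loop.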


12.2.2 Software Stack


The TMS320C30 has a software stack whose location is determined by the
contents of the stack pointer register SP. The stack grows from low to high
addresses, and enough memory should be reserved to accommodate the
anticipated storage requirements. The stack can be used not only by the
subroutine CALL and RETS, but also inside a subroutine as temporary storage
for registers, as shown in Example 12-2. SP always points to the last value
pushed on the stack.


The CALL and CALLcond instructions push the value of the program counter on
the stack, as does interrupt processing. RETScond and RETIcond then pop the
stack and place the value in the program counter. The integer value of any
register can be pushed on and popped off the stack using the PUSH and POP
instructions. Two additional instructions, PUSHF and POPF, handle
floating-point numbers; they push and pop floating-point values from and to
registers R0-R7. This feature is useful for saving the extended-precision
registers (see Example 12-2). Using PUSH and PUSHF on the same register
saves both the lower 32 and the upper 32 bits: PUSH saves the lower 32;
PUSHF, the upper 32. To recover the extended-precision number, execute POPF
followed by POP. The integer and floating-point pushes and pops must be
performed in this order, because POPF forces the last eight bits of the
extended-precision register to zero.


The stack pointer (SP) can be both read and written, so multiple stacks for
different program segments are easy to create. SP is not initialized by the
hardware during reset. It is therefore important to initialize SP so that it
points to a predetermined memory location. This avoids the problem of SP
attempting to write into ROM or over other useful data.
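
The ordering rule for PUSH/PUSHF and POPF/POP can be illustrated with a bit-level model. The Python sketch below treats an extended-precision register as a 40-bit integer and makes two assumptions that should be checked against the instruction descriptions: POPF loads bits 39-8 and clears bits 7-0, and an integer POP replaces bits 31-0 while leaving bits 39-32 unchanged. The function names are hypothetical.

```python
LOW32 = 0xFFFFFFFF


def save_restore(reg40):
    """PUSH, PUSHF, then POPF, POP on a 40-bit register value."""
    stack = []
    stack.append(reg40 & LOW32)            # PUSH: lower 32 bits
    stack.append((reg40 >> 8) & LOW32)     # PUSHF: upper 32 bits
    r = stack.pop() << 8                   # POPF: upper 32 bits, low 8 bits zeroed
    r = (r & ~LOW32) | stack.pop()         # POP: lower 32 bits, bits 39-32 kept (assumed)
    return r


def popf_only(reg40):
    """PUSHF followed by POPF alone: the low 8 bits are lost."""
    return ((reg40 >> 8) & LOW32) << 8


value = 0x0669E000AB
print(hex(save_restore(value)), hex(popf_only(value)))
```

In this model the full sequence restores all 40 bits, while a floating-point save and restore on its own returns the value with its last eight bits cleared.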
12.2.3 Interrupt Service Routines


Interrupts on the TMS320C30 are prioritized and vectored. When an interrupt
occurs, the corresponding flag is set in the interrupt flag register IF. If
the corresponding bit in the interrupt enable register IE is set, and
interrupts are globally enabled (the GIE bit in the status register is 1),
interrupt processing begins. The interrupt flag register can also be
written. This lets the user force an interrupt in software or clear
interrupts without processing them.


The interrupt flag register IF can be read, and action taken based on
whether an interrupt has occurred. This is true even when the interrupt is
disabled, which is useful when an interrupt-driven interface is not
implemented. Example 12-3 shows a case where a subroutine is called when
interrupt 1 has not occurred.


Example 12-3. Use of Interrupts for Software Polling


*       TITL INTERRUPT POLLING
*
        TSTB    2,IF            ; Test if interrupt 1 has occurred
        CALLZ   SUBROUTINE      ; If not, call subroutine
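
The polling idiom of Example 12-3 reduces to a mask test followed by a conditional call. The Python sketch below is an illustrative equivalent; `poll_int1`, the mask name, and the callback are hypothetical, and the mask value 2 is taken from the TSTB operand in the example.

```python
INT1_FLAG = 0x2   # mask used by TSTB 2,IF in Example 12-3


def poll_int1(if_reg, subroutine):
    """Call subroutine when the interrupt 1 flag is clear; report whether it ran."""
    if (if_reg & INT1_FLAG) == 0:   # TSTB sets Z when the AND result is zero
        subroutine()                # CALLZ: the call is taken when Z = 1
        return True
    return False


calls = []
poll_int1(0x0, lambda: calls.append("ran"))   # flag clear: subroutine is called
poll_int1(0x2, lambda: calls.append("ran"))   # flag set: subroutine is skipped
print(calls)
```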


When interrupt processing begins, the program counter is pushed on the
stack, and the interrupt vector is loaded in the program counter. Interrupts
are then disabled by setting GIE = 0, and the program continues from the
address loaded in the program counter. Since all interrupts are disabled,
interrupt processing proceeds without further interruption unless the
interrupt service routine re-enables interrupts.


Except for very simple interrupt service routines, it is important to ensure
that the processor context is saved during execution of the routine. The
context must be saved before the routine itself executes and restored after
the routine is finished. This procedure is called context switching. Context
switching is also useful for subroutine calls, especially when extensive use
is made of the auxiliary and extended-precision registers. Code examples of
context switching and an interrupt service routine are provided in this
section.
12.2.3.1 Context Switching


Context switching is commonly required when processing a subroutine call or
interrupt. It may be extensive or simple, depending on system requirements.
On the TMS320C30, the program counter is automatically pushed on the stack.
Any important information in the other TMS320C30 registers, such as the
status, auxiliary, or extended-precision registers, must be saved
explicitly by additional instructions.


Examples 12-4 and 12-5 show saving and restoring of the TMS320C30 state. In
both examples, the stack is used for saving the registers, and it expands
towards higher addresses. If it is not desirable to use the stack pointed to
by SP, a separate stack can be created using an auxiliary register as the
stack pointer. The registers saved are:


•  Extended-precision registers R0 through R7

•  Auxiliary registers AR0 through AR7

•  Data page pointer DP

•  Index registers IR0 and IR1

•  Block size register BK

•  Status register ST

•  Interrupt-related registers IE and IF

•  I/O flag IOF

•  Repeat-related registers RS, RE, and RC


Example 12-4. Context Save For The TMS320C30


* TITL CONTEXT-SAVE FOR THE TMS320C30
*
        .global SAVE
*
* CONTEXT SAVE ON SUBROUTINE CALL OR INTERRUPT.
*
SAVE:
*
* SAVE THE EXTENDED PRECISION REGISTERS
*
        PUSH    R0              ; Save the lower 32 bits of R0
        PUSHF   R0              ; and the upper 32 bits
        PUSH    R1              ; Save the lower 32 bits of R1
        PUSHF   R1              ; and the upper 32 bits
        PUSH    R2              ; Save the lower 32 bits of R2
        PUSHF   R2              ; and the upper 32 bits
        PUSH    R3              ; Save the lower 32 bits of R3
        PUSHF   R3              ; and the upper 32 bits
        PUSH    R4              ; Save the lower 32 bits of R4
        PUSHF   R4              ; and the upper 32 bits
        PUSH    R5              ; Save the lower 32 bits of R5
        PUSHF   R5              ; and the upper 32 bits
        PUSH    R6              ; Save the lower 32 bits of R6
        PUSHF   R6              ; and the upper 32 bits
        PUSH    R7              ; Save the lower 32 bits of R7
        PUSHF   R7              ; and the upper 32 bits
*
* SAVE THE AUXILIARY REGISTERS
*
        PUSH    AR0             ; Save AR0
        PUSH    AR1             ; Save AR1
        PUSH    AR2             ; Save AR2
        PUSH    AR3             ; Save AR3
        PUSH    AR4             ; Save AR4
        PUSH    AR5             ; Save AR5
        PUSH    AR6             ; Save AR6
        PUSH    AR7             ; Save AR7
*
* SAVE THE REST REGISTERS FROM THE REGISTER FILE
*
        PUSH    DP              ; Save data page pointer
        PUSH    IR0             ; Save index register IR0
        PUSH    IR1             ; Save index register IR1
        PUSH    BK              ; Save block-size register
        PUSH    ST              ; Save status register
        PUSH    IE              ; Save interrupt enable register
        PUSH    IF              ; Save interrupt flag register
        PUSH    IOF             ; Save I/O flag register
        PUSH    RS              ; Save repeat start address
        PUSH    RE              ; Save repeat end address
        PUSH    RC              ; Save repeat counter
*
* SAVE IS COMPLETE
*


Example 12-5. Context-Restore For The TMS320C30


* TITL CONTEXT-RESTORE FOR THE TMS320C30
*
        .global RESTR
*
* CONTEXT RESTORE AT THE END OF A SUBROUTINE CALL OR INTERRUPT.
*
RESTR:
*
* RESTORE THE REST REGISTERS FROM THE REGISTER FILE
*
        POP     RC              ; Restore repeat counter
        POP     RE              ; Restore repeat end address
        POP     RS              ; Restore repeat start address
        POP     IOF             ; Restore I/O flag register
        POP     IF              ; Restore interrupt flag register
        POP     IE              ; Restore interrupt enable register
        POP     ST              ; Restore status register
        POP     BK              ; Restore block-size register
        POP     IR1             ; Restore index register IR1
        POP     IR0             ; Restore index register IR0
        POP     DP              ; Restore data page pointer
*
* RESTORE THE AUXILIARY REGISTERS
*
        POP     AR7             ; Restore AR7
        POP     AR6             ; Restore AR6
        POP     AR5             ; Restore AR5
        POP     AR4             ; Restore AR4
        POP     AR3             ; Restore AR3
        POP     AR2             ; Restore AR2
        POP     AR1             ; Restore AR1
        POP     AR0             ; Restore AR0
*
* RESTORE THE EXTENDED PRECISION REGISTERS
*
        POPF    R7              ; Restore the upper 32 bits and
        POP     R7              ; the lower 32 bits of R7
        POPF    R6              ; Restore the upper 32 bits and
        POP     R6              ; the lower 32 bits of R6
        POPF    R5              ; Restore the upper 32 bits and
        POP     R5              ; the lower 32 bits of R5
        POPF    R4              ; Restore the upper 32 bits and
        POP     R4              ; the lower 32 bits of R4
        POPF    R3              ; Restore the upper 32 bits and
        POP     R3              ; the lower 32 bits of R3
        POPF    R2              ; Restore the upper 32 bits and
        POP     R2              ; the lower 32 bits of R2
        POPF    R1              ; Restore the upper 32 bits and
        POP     R1              ; the lower 32 bits of R1
        POPF    R0              ; Restore the upper 32 bits and
        POP     R0              ; the lower 32 bits of R0
*
* RESTORE IS COMPLETE
*


12.2.3.2 Interrupt Priority


Interrupts on the TMS320C30 are automatically prioritized. This allows
interrupts that occur simultaneously to be serviced in a predefined order.
Infrequent, but lengthy, interrupt service routines may need to be interrupted by
more frequently occurring interrupts. In Example 12-6, the interrupt service
routine for INT2 temporarily modifies the interrupt enable register IE to permit
interrupt processing when an INT0 interrupt (but no other interrupt) occurs.
When the routine has finished processing, the register IE is restored to its
original state. Notice that the RETI instruction not only pops the next program
counter address from the stack, but also sets the GIE bit of the status register.
This enables all interrupts that have their interrupt-enable bit set.


Example 12-6. Interrupt Service Routine


* TITL INTERRUPT SERVICE ROUTINE
*
        .global ISR2
ENABLE  .set    2000h
MASK    .set    1
*
* INTERRUPT PROCESSING FOR EXTERNAL INTERRUPT INT2-
*
ISR2:
        PUSH    ST              ; Save status register
        PUSH    DP              ; Save data page pointer
        PUSH    IE              ; Save interrupt enable register
        PUSH    R0              ; Save lower 32 bits and
        PUSHF   R0              ; upper 32 bits of R0
        PUSH    R1              ; Save lower 32 bits and
        PUSHF   R1              ; upper 32 bits of R1
        LDI     MASK,IE         ; Unmask only INT0
        OR      ENABLE,ST       ; Enable all interrupts
*
* MAIN PROCESSING SECTION FOR ISR2
*
        XOR     ENABLE,ST       ; Disable all interrupts
        POPF    R1              ; Restore upper 32 bits and
        POP     R1              ; lower 32 bits of R1
        POPF    R0              ; Restore upper 32 bits and
        POP     R0              ; lower 32 bits of R0
        POP     IE              ; Restore interrupt enable register
        POP     DP              ; Restore data page register
        POP     ST              ; Restore status register
*
        RETI                    ; Return and enable interrupts


12.2.4 Delayed Branches


The TMS320C30 offers the capability of single-cycle branching through the
use of delayed branches. Delayed branches operate like regular branches
but do not flush the pipeline. Instead, the three instructions following a
delayed branch are also executed. As discussed in the section on program-flow
control, the only limitation is that none of the three instructions following a
delayed branch can be:


   •  A branch (standard or delayed)
   •  A call to a subroutine
   •  A return from a subroutine
   •  A return from an interrupt
   •  A repeat instruction
   •  A TRAP instruction
   •  An IDLE instruction


Conditional delayed branches use the conditions that exist at the end of the
instruction immediately preceding the delayed branch. Sometimes, a branch
is necessary in the flow of a program, but fewer than three instructions can be
placed after a delayed branch. For faster execution, it is still advantageous to
use a delayed branch. This is shown in Example 12-7, with NOPs taking the
place of the unused instructions. The trade-off is more instruction words for
less execution time.


Example 12-7. Delayed Branch Execution


* TITL DELAYED BRANCH EXECUTION
*
        LDF     *+AR1(5),R2     ; Load contents of memory to R2
        BGED    SKIP            ; If loaded number >=0, branch (delayed)
        LDFN    R2,R1           ; If loaded number <0, load it to R1
        SUBF    3.0,R1          ; Subtract 3 from R1
        NOP                     ; Dummy operation to complete delayed
                                ; branch
        MPYF    1.5,R1          ; Continue here if loaded number <0
SKIP    LDF     R1,R3           ; Continue here if loaded number >=0


12.2.5 Repeat Modes


The TMS320C30 supports looping without any overhead. Two instructions
serve that purpose: RPTB repeats a block of code, and RPTS repeats a
single instruction. Three control registers, RS (repeat start address),
RE (repeat end address), and RC (repeat counter), contain the parameters
that specify loop execution (refer to Section 7.1 for a complete
description of RPTB and RPTS). RS and RE are automatically set from the code,
while RC has to be set by the user, as shown in the examples below.


12.2.5.1 Block Repeat 


Example 12-8 shows an application of the block repeat construct. In this
example, an array of 64 elements is "flipped over" by exchanging the elements
that are equidistant from the two ends of the array. In other words, if the
original array is:


   a(1), a(2),..., a(31), a(32),..., a(64);

the final array after the rearrangement will be:

   a(64), a(63),..., a(32), a(31),..., a(1).

Note that since each exchange operates on two elements at the same
time, 32 operations are needed. The repeat counter RC is therefore initialized
to 31. In general, if RC contains the number N, the loop will be executed N+1
times. The loop is defined by the RPTB instruction and the EXCH label.
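

The relationship between the repeat counter and the number of passes can be checked with a small Python model of the exchange loop. This is a sketch of the loop's behavior only, not TMS320C30 code; `flip_array` and its local names are hypothetical, with `lo` and `hi` standing in for the two auxiliary-register pointers.

```python
def flip_array(a):
    """Reverse list a in place; return how many times the loop body ran."""
    if len(a) < 2:
        return 0
    lo, hi = 0, len(a) - 1          # pointers to the first and last elements
    rc = len(a) // 2 - 1            # repeat counter: body runs rc + 1 times
    iterations = 0
    while True:
        a[lo], a[hi] = a[hi], a[lo]  # exchange the symmetric pair
        lo += 1
        hi -= 1
        iterations += 1
        if rc == 0:                  # counter exhausted: leave the loop
            break
        rc -= 1
    return iterations


data = list(range(1, 65))            # a(1) ... a(64)
count = flip_array(data)
print(count, data[:3], data[-3:])
```

For the 64-element array the counter starts at 31 and the body executes 32 times, matching the N+1 rule stated above.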


Example 12-8. Loop Using Block Repeat


* TITL LOOP USING BLOCK REPEAT
*
* THIS CODE SEGMENT EXCHANGES THE VALUES OF ARRAY ELEMENTS THAT ARE
* SYMMETRIC AROUND THE MIDDLE OF THE ARRAY.
*
        LDI     @ADDR,AR0       ; AR0 points to the beginning of the array
        LDI     AR0,AR1
        ADDI    63,AR1          ; AR1 points to the end of the
                                ; 64-element array
        LDI     31,RC           ; Initialize repeat counter
        RPTB    EXCH            ; Repeat RC+1 times between here and
                                ; EXCH
        LDI     *AR0,R0         ; Load one memory element in R0,
        LDI     *AR1,R1         ; and the other in R1
        STI     R1,*AR0++(1)    ; Then, exchange their locations
EXCH    STI     R0,*AR1--(1)
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rule is, since the program counter is modified at the end of the loop according 
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to modify the repeat counter or the program counter at the end of the loop in 
a different way. 


In principle, it is possible to nest repeat blocks. However, there is only one
set of control registers RS, RE, and RC. It is therefore necessary to save these
registers before entering an inner loop. It may be more economical to
implement a nested loop by the more traditional method of using a register
as a counter together with a delayed branch, rather than applying the above
approach.


Example 12-9 shows another use of the block repeat: finding the
maximum of 147 numbers.


Example 12-9. Use of Block Repeat to Find a Maximum


* TITL USE OF BLOCK REPEAT TO FIND A MAXIMUM
*
* THIS ROUTINE FINDS THE MAXIMUM OF N=147 NUMBERS
*
        LDI     146,RC          ; Initialize repeat counter to 147-1
        LDI     @ADDR,AR0       ; AR0 points to the beginning of the array
        LDF     *AR0++(1),R0    ; Initialize MAX to the first value
        RPTB    LOOP
        CMPF    *AR0++(1),R0    ; Compare number to the maximum
LOOP    LDFLT   *-AR0(1),R0     ; If greater, this is a new maximum


12.2.5.2 Single-Instruction Repeat 


The single-instruction repeat uses the control registers RS, RE, and
RC in the same way as the block repeat. Its advantage over the block repeat
is that the instruction is fetched only once, after which the buses are available
for moving operands. One difference to note is that the single-instruction
repeat construct is not interruptible, while the block repeat is interruptible.


Example 12-10 shows an application of the repeat-single construct. In this
example, the sum of the products of two arrays is computed. The arrays are
not necessarily different. If the arrays are a(i) and b(i), each of length N=512,
register R0 will contain, after computation, the quantity:


   a(1)b(1) + a(2)b(2) + ... + a(N)b(N).


The value of the repeat counter (RC) is specified to be 511 in the instruction.
If RC contains the number N, the loop will be executed N+1 times.


Example 12-10. Loop Using Single Repeat


* TITL LOOP USING SINGLE REPEAT
*
*                                  N
* THIS CODE SEGMENT COMPUTES      SUM  a(i)b(i)
*                                 i=1
*
        LDI     @ADDR1,AR0      ; AR0 points to array a(i)
        LDI     @ADDR2,AR1      ; AR1 points to array b(i)
*
        LDF     0.0,R0          ; Initialize R0
*
        MPYF3   *AR0++(1),*AR1++(1),R1
                                ; Compute first product
        RPTS    511             ; Repeat 512 times
*
        MPYF3   *AR0++(1),*AR1++(1),R1  ; Compute next product
||      ADDF3   R1,R0,R0                ; and accumulate the previous one
*
        ADDF    R1,R0           ; One final addition


12.2.6 Computed GOTOs


Occasionally, it is convenient to select at run time, rather than at assembly
time, which subroutine is to be executed. The TMS320C30 offers the
capability of a computed GOTO that can satisfy such a need. The computed
GOTO is implemented using the CALLcond instruction in the register
addressing mode. This instruction uses the contents of the register as the
address of the call. Example 12-11 shows the case of a task controller.


Example 12-11. Computed GOTO


* TITL COMPUTED GOTO
*
* TASK CONTROLLER
*
* THIS MAIN ROUTINE CONTROLS THE ORDER OF TASK EXECUTION (6 TASKS
* IN THE PRESENT EXAMPLE). TASK0 THROUGH TASK5 ARE THE NAMES OF
* SUBROUTINES TO BE CALLED. THEY ARE EXECUTED IN ORDER, TASK0,
* TASK1, . . . TASK5. WHEN AN INTERRUPT OCCURS, THE INTERRUPT
* SERVICE ROUTINE IS EXECUTED, AND THE PROCESSOR CONTINUES
* WITH THE INSTRUCTION FOLLOWING THE IDLE INSTRUCTION. THIS
* ROUTINE SELECTS THE TASK APPROPRIATE FOR THE CURRENT CYCLE,
* CALLS THE TASK AS A SUBROUTINE, AND BRANCHES BACK TO THE IDLE
* TO WAIT FOR THE NEXT SAMPLE INTERRUPT WHEN THE SCHEDULED TASK
* HAS COMPLETED EXECUTION. R0 HOLDS THE OFFSET FROM THE BASE
* ADDRESS OF THE TASK TO BE EXECUTED.
*
        LDI     5,R0            ; Initialize R0
        LDI     @ADDR,AR1       ; AR1 holds the base address of the table
WAIT    IDLE                    ; Wait for the next interrupt
        ADDI    *AR1,R0,R1      ; Add the base address to the table
                                ; entry number
        SUBI    1,R0            ; Decrement R0
        LDILT   5,R0            ; If R0<0, reinitialize it to 5
        CALLU   R1              ; Execute appropriate task
        BR      WAIT
*
TSKSEQ  .word   TASK5           ; Address of TASK5
        .word   TASK4           ; Address of TASK4
        .word   TASK3           ; Address of TASK3
        .word   TASK2           ; Address of TASK2
        .word   TASK1           ; Address of TASK1
        .word   TASK0           ; Address of TASK0
ADDR    .word   TSKSEQ


12.3 Logical and Arithmetic Operations


The TMS320C30 instruction set supports both integer and floating-point
arithmetic and logical operations. The basic functions of such instructions can
be combined to form more complex operations. This section examines
examples of such operations, including:


   •  Bit manipulation
   •  Block moves
   •  Bit-reversed addressing
   •  Integer and floating-point division
   •  Square root
   •  Extended-precision arithmetic
   •  Floating-point format conversion between IEEE and TMS320C30
      formats.


12.3.1 Bit Manipulation 


The TMS320C30 instructions for the usual logical operations, such as
AND, OR, NOT, ANDN, and XOR, can be used together with the shift
instructions for bit manipulation. In addition, there is a special instruction,
TSTB, for testing bits. TSTB performs the same operation as AND, but the
result of the logical AND is not written anywhere and is used only to set
the condition flags. Examples 12-12 and 12-13 demonstrate the use of
several of these instructions for bit manipulation and testing.


Example 12-12. Use of TSTB for Software-Controlled Interrupt


* TITL USE OF TSTB FOR SOFTWARE-CONTROLLED INTERRUPT
*
* IN THIS EXAMPLE, ALL INTERRUPTS HAVE BEEN DISABLED BY
* RESETTING THE GIE BIT OF THE STATUS REGISTER. WHEN AN
* INTERRUPT ARRIVES, IT IS STORED IN THE IF REGISTER. THE
* PRESENT EXAMPLE ACTIVATES THE INTERRUPT SERVICE ROUTINE INTR
* WHEN IT DETECTS THAT INT2- HAS OCCURRED.
*
        TSTB    4,IF            ; Check if bit 2 of IF is set,
        CALLNZ  INTR            ; and, if so, call subroutine INTR


Example 12-13. Copy a Bit From One Location to Another


* TITL COPY A BIT FROM ONE LOCATION TO ANOTHER
*
* BIT I OF R1 NEEDS TO BE COPIED TO BIT J OF R2.
* AR0 POINTS TO A LOCATION HOLDING I, AND IT IS ASSUMED THAT THE
* NEXT MEMORY LOCATION HOLDS THE VALUE J.
*
        LDI     1,R0
        LSH     *AR0,R0         ; Shift 1 to align it with bit I
        TSTB    R1,R0           ; Test the I-th bit of R1
        BZD     CONT            ; If bit = 0, branch delayed
        LDI     1,R0
        LSH     *+AR0(1),R0     ; Align 1 with J-th location
        ANDN    R0,R2           ; If bit = 0, reset J-th bit of R2
        OR      R0,R2           ; If bit = 1, set J-th bit of R2
CONT    .
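

The effect of the bit-copy sequence can be stated compactly in Python. The function below is a behavioral sketch on 32-bit words, not TMS320C30 code; `copy_bit` and `MASK32` are hypothetical names, and the comments indicate which instruction each step corresponds to.

```python
MASK32 = 0xFFFFFFFF


def copy_bit(r1, r2, i, j):
    """Return r2 with bit j replaced by bit i of r1 (32-bit words)."""
    if r1 & (1 << i):                      # TSTB: test bit i, result not stored
        return (r2 | (1 << j)) & MASK32    # OR: bit was 1, set bit j
    return r2 & ~(1 << j) & MASK32         # ANDN: bit was 0, clear bit j


print(hex(copy_bit(0b1000, 0x00, 3, 0)))   # bit 3 of r1 is 1
print(hex(copy_bit(0b0000, 0xFF, 3, 0)))   # bit 3 of r1 is 0
```

All other bits of the destination word are left unchanged in both cases.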


12.3.2 Block Moves


Since the TMS320C30 directly addresses a large amount of memory, blocks
of data or program code can be stored off-chip in slow memories and then
loaded on-chip for faster execution. Data can also be moved from on-chip to
off-chip memory for storage or for multiprocessor data transfers.


Such data transfers can be accomplished very efficiently in parallel with CPU
operations, using the DMA. The DMA operation is explained in detail in an
earlier section of this manual. An alternative to DMA is to perform data
transfers under program control using load and store instructions in a repeat
mode. Example 12-14 shows the case where a block of 512 floating-point
numbers is transferred from external memory to block 1 of the on-chip RAM.
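

A program-controlled block move of this kind is usually software-pipelined: the load runs one element ahead of the store, so one load and one store can be paired in each repeated step. The Python model below sketches that schedule under this assumption. It is not TMS320C30 code, and `block_move` is a hypothetical name.

```python
def block_move(src):
    """Copy src with a load that runs one element ahead of the store."""
    n = len(src)
    if n == 0:
        return []
    dst = [None] * n
    held = src[0]                   # load the first number
    for k in range(1, n):           # repeated n - 1 times
        nxt = src[k]                # load the next number ...
        dst[k - 1] = held           # ... while storing the previous one
        held = nxt
    dst[n - 1] = held               # store the last number
    return dst


source = [float(i) for i in range(512)]
print(block_move(source) == source)
```

For 512 elements the paired load/store step runs 511 times, framed by one initial load and one final store.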


Example 12-14. Block Move Under Program Control


* TITL BLOCK MOVE UNDER PROGRAM CONTROL
*
extern  .word   01000H
block1  .word   0809C00H
*
        LDI     @extern,AR0     ; Source address
        LDI     @block1,AR1     ; Destination address
        LDF     *AR0++,R0       ; Load the first number
        RPTS    510             ; Repeat following instruction 511 times
        LDF     *AR0++,R0       ; Load the next number, and...
||      STF     R0,*AR1++       ; store the previous one
        STF     R0,*AR1         ; Store the last number


12.3.3 Bit-Reversed Addressing 


For an efficient implementation of Fast Fourier Transforms (FFT), the
TMS320C30 offers the capability of bit-reversed addressing. If the data to
be transformed is in the correct order, the final result of the FFT is scrambled
(in bit-reversed order). To recover the frequency-domain data in the correct
order, certain memory locations have to be swapped. The bit-reversed
addressing mode makes this swapping unnecessary: the next time the data
needs to be accessed, the access is done in a bit-reversed manner rather
than sequentially.


In bit-reversed addressing, IR0 holds a value equal to one-half the size of the
FFT, if real and imaginary data are stored in separate arrays. During accessing,
the auxiliary register is indexed by IR0, but with reverse carry propagation.
Example 12-15 illustrates a 512-point complex FFT being moved from the
place of computation (pointed at by AR0) to a location pointed at by AR1.
In this example, the real and imaginary parts XR(i) and XI(i) of the data are not
stored in separate arrays, but are interleaved: XR(0), XI(0), XR(1), XI(1), ...,
XR(N-1), XI(N-1). Because of this arrangement, the length of the array is 2N
instead of N, and IR0 is set to 512 instead of 256.
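

Reverse-carry indexing can be modeled in Python by reversing the bits of both operands, adding normally, and reversing the sum. The sketch below generates the offsets visited for an interleaved array of N complex points when the index equals N (half the array length 2N). It is a behavioral model only, restricted to offsets within the array, and `reverse_carry_add` and `bit_reversed_sequence` are hypothetical names.

```python
def reverse_carry_add(addr, index, width):
    """Add index to addr with the carry propagating from MSB to LSB."""
    def rev(x):
        # Reverse the width-bit binary representation of x.
        return int(format(x, f"0{width}b")[::-1], 2)

    mask = (1 << width) - 1
    return rev((rev(addr & mask) + rev(index & mask)) & mask)


def bit_reversed_sequence(n_points):
    """Offsets of the real parts in an interleaved n_points complex array."""
    width = (2 * n_points).bit_length() - 1   # log2 of the array length 2N
    addr, offsets = 0, []
    for _ in range(n_points):
        offsets.append(addr)
        addr = reverse_carry_add(addr, n_points, width)  # index = N
    return offsets


print(bit_reversed_sequence(8))
```

For N = 8 the offsets are 0, 8, 4, 12, 2, 10, 6, 14: always even (the real parts), and the point indices 0, 4, 2, 6, 1, 5, 3, 7 are the bit-reversed counting order.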


Example 12-15. Bit-Reversed Addressing


* TITL BIT-REVERSED ADDRESSING
*
* THIS EXAMPLE MOVES THE RESULT OF THE 512-POINT FFT
* COMPUTATION, POINTED AT BY AR0, TO A LOCATION POINTED AT
* BY AR1. REAL AND IMAGINARY POINTS ARE ALTERNATING.
*
        LDI     512,IR0
        LDI     2,IR1
        LDI     511,RC          ; Repeat 511+1 times
        LDF     *+AR0(1),R1     ; Load first imaginary point
        RPTB    LOOP
*
        LDF     *AR0++(IR0)B,R0 ; Load real value (and point
||      STF     R1,*+AR1(1)     ; to next location) and store
                                ; the imaginary value
*
LOOP    LDF     *+AR0(1),R1     ; Load next imaginary point and store
||      STF     R0,*AR1++(IR1)  ; previous real value


12.3.4 Integer and Floating-point Division 


Although division is not implemented as a single instruction in the
TMS320C30, the instruction set provides the necessary capabilities for an
efficient division routine. Integer and floating-point division are examined
separately because different algorithms are used.


12.3.4.1 Integer Division


Division is implemented on the TMS320C30 by repeated subtractions using
SUBC, a special conditional subtract instruction. Consider the case of a 32-bit
positive dividend with i significant bits (and 32-i sign bits), and a 32-bit
positive divisor with j significant bits (and 32-j sign bits). Repeating the
SUBC instruction i-j+1 times produces a 32-bit result in which the lower i-j+1
bits are the quotient and the upper 31-i+j bits are the remainder of the division.


SUBC implements binary division in the same manner as long division. The
divisor (assumed to be smaller than the dividend) is shifted left i-j times to be
aligned with the dividend. Then, using SUBC, the shifted divisor is subtracted
from the dividend. For each subtraction that does not produce a negative answer,
the dividend is replaced by the difference, which is then shifted to the left with
a one put in the LSB. If the difference is negative, the dividend is simply
shifted left by one. This operation is repeated i-j+1 times.
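

The conditional-subtract step and the way quotient and remainder are packed into one word can be checked with a short Python model. It follows the description above for positive operands with the dividend at least as large as the divisor. It is not TMS320C30 code, and `subc` and `divide` are hypothetical names.

```python
def subc(dividend, divisor):
    """One conditional-subtract step as described in the text."""
    diff = dividend - divisor
    if diff >= 0:
        return (diff << 1) | 1      # keep the difference, shift in a 1
    return dividend << 1            # negative difference: shift only


def divide(dividend, divisor):
    """Divide positive integers (dividend >= divisor) by repeated subc steps."""
    i = dividend.bit_length()       # significant bits of the dividend
    j = divisor.bit_length()        # significant bits of the divisor
    count = i - j + 1               # number of repetitions
    aligned = divisor << (i - j)    # divisor aligned with the dividend
    result = dividend
    for _ in range(count):
        result = subc(result, aligned)
    quotient = result & ((1 << count) - 1)   # lower i-j+1 bits
    remainder = result >> count              # remaining upper bits
    return quotient, remainder


print(divide(33, 5))
```

Dividing 33 by 5 takes 6-3+1 = 4 steps and leaves the word 110110 in binary: the lower four bits give the quotient 6 and the bits above them give the remainder 3.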


As an example, consider the division of 33 by 5 using both long division and 
the SUBC method. In this case, i=6, j=3, and the SUBC operation is repeated 
6-3+1=4 times. 


LONG DIVISION (leading zeros of the 32-bit words omitted):

             110      Quotient
        +-------
    101 | 100001
          -101
          ----
            110
           -101
           ----
              11      Remainder


SUBC METHOD:

   00000000000000000000000000100001   Dividend
   00000000000000000000000000101000   Divisor (aligned)
                                      Negative difference (1st SUBC command)

   00000000000000000000000001000010   New Dividend + Quotient
   00000000000000000000000000101000   Divisor
   00000000000000000000000000011010   Difference (>0) (2nd SUBC command)

   00000000000000000000000000110101   New Dividend + Quotient
   00000000000000000000000000101000   Divisor
   00000000000000000000000000001101   Difference (>0) (3rd SUBC command)

   00000000000000000000000000011011   New Dividend + Quotient
   00000000000000000000000000101000   Divisor
                                      Negative difference (4th SUBC command)

   0000000000000000000000000011 | 0110   Final Result
                      Remainder | Quot.


When using the SUBC command, both the dividend and the divisor must be
positive. Example 12-16 shows a realization of the integer division, where the
sign of the quotient is properly handled. The last instruction before returning
modifies the condition flag, in case subsequent operations depend on the sign
of the result.


Example 12-16. Integer Division


* TITL INTEGER DIVISION
*
* SUBROUTINE DIVI
*
* INPUTS:    SIGNED INTEGER DIVIDEND IN R0,
*            SIGNED INTEGER DIVISOR IN R1.
*
* OUTPUT:    R0/R1 into R0.
*
* REGISTERS USED: R0-R3, IR0, IR1
*
* OPERATION: 1. NORMALIZE DIVISOR WITH DIVIDEND
*            2. REPEAT SUBC
*            3. QUOTIENT IS IN LSBs OF RESULT
*
* CYCLES:    31-62 (DEPENDS ON AMOUNT OF NORMALIZATION)
*
        .globl  DIVI
SIGN    .set    R2
TEMPF   .set    R3
TEMP    .set    IR0
COUNT   .set    IR1
*
* DIVI - SIGNED DIVISION
*
DIVI:
*
* DETERMINE SIGN OF RESULT. GET ABSOLUTE VALUE OF OPERANDS.
*
        XOR     R0,R1,SIGN      ; Get the sign
        ABSI    R0
        ABSI    R1
        CMPI    R0,R1           ; Divisor > dividend ?
        BHID    ZERO            ; If so, return 0
*
* NORMALIZE OPERANDS. USE DIFFERENCE IN EXPONENTS AS SHIFT COUNT
* FOR DIVISOR, AND AS REPEAT COUNT FOR 'SUBC'.
*
        FLOAT   R0,TEMPF        ; Normalize dividend
        PUSHF   TEMPF           ; PUSH as float
        POP     COUNT           ; POP as int
        LSH     -24,COUNT       ; Get dividend exponent
*
        FLOAT   R1,TEMPF        ; Normalize divisor
        PUSHF   TEMPF           ; PUSH as float
        POP     TEMP            ; POP as int
        LSH     -24,TEMP        ; Get divisor exponent
        SUBI    TEMP,COUNT      ; Get difference in exponents
        LSH     COUNT,R1        ; Align divisor with dividend
*
* DO COUNT+1 SUBTRACT & SHIFTS.
*
        RPTS    COUNT
        SUBC    R1,R0
*
* MASK OFF THE LOWER COUNT+1 BITS OF R0
*
        SUBRI   31,COUNT        ; Shift count is (32 - (COUNT+1))
        LSH     COUNT,R0        ; Shift left
        NEGI    COUNT
        LSH     COUNT,R0        ; Shift right to get result
*
* CHECK SIGN AND NEGATE RESULT IF NECESSARY.
*
        NEGI    R0,R1           ; Negate result
        ASH     -31,SIGN        ; Check sign
        LDINZ   R1,R0           ; If set, use negative result
        CMPI    0,R0            ; Set status from result
        RETS
*
* RETURN ZERO.
*
ZERO:
        LDI     0,R0
        RETS
        .end


If the dividend is less than the divisor and fractional division is desired, a
division can be performed after determining the desired accuracy of the quotient
in bits. If the desired accuracy is k bits, start by shifting the dividend left by k
positions. Then apply the algorithm described above, with i replaced by i+k.
It is assumed that i+k is less than 32.


12.3.4.2 Computation of Floating-point Inverse and Division 


This section presents a method of implementing floating-point division on the
TMS320C30. The algorithm outlined here computes the inverse of a
number v; to compute y/v, multiply y by the inverse of v.


The computation of 1/v is based on the following iterative algorithm. At the
i-th iteration, the estimate x[i] of 1/v is computed from v and the previous
estimate x[i-1] according to the formula:


   x[i] = x[i-1] * (2.0 - v * x[i-1])


To start the operation, an initial estimate x[0] is needed. If v = a * 2**e, a good
initial estimate is:


   x[0] = 1.0 * 2**(-e-1)
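

The iteration and its starting value can be tried out in Python. The sketch below assumes v > 0 and a normalized mantissa 1 <= a < 2, and runs in double precision, so it illustrates convergence rather than the rounding behavior of the TMS320C30 formats. The name `reciprocal` is hypothetical.

```python
import math


def reciprocal(v, iterations=5):
    """Estimate 1/v for v > 0 with x[i] = x[i-1] * (2.0 - v * x[i-1])."""
    mantissa, exp2 = math.frexp(v)   # v = mantissa * 2**exp2, 0.5 <= mantissa < 1
    e = exp2 - 1                     # so v = a * 2**e with 1 <= a < 2
    x = math.ldexp(1.0, -e - 1)      # x[0] = 1.0 * 2**(-e-1)
    for _ in range(iterations):
        x = x * (2.0 - v * x)
    return x


for v in (3.0, 0.1, 1000.0):
    print(v, reciprocal(v), abs(v * reciprocal(v) - 1.0))
```

With this starting value the product v * x[0] lies between 0.5 and 1, so the initial relative error is at most 0.5. Each iteration squares the error, which gives at most 0.5**32 after five iterations, well below the 2**(-23) resolution of the single-precision mantissa.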


Example 12-17 shows the implementation of this algorithm on the
TMS320C30, where the iteration has been applied 5 times. The number of
iterations was chosen to obtain maximum accuracy. The
accuracy offered by the single-precision floating-point format is
2**(-23) = 1.192E-7. If more accuracy is desired, more iterations can be used. If
less accuracy is acceptable, the execution speed of this implementation can
be increased by reducing the number of iterations.


This algorithm properly treats the boundary conditions, when the input
number is either zero or has a very large value. When the input is zero, the
exponent is e = -128. The calculation of x[0] then yields an exponent equal to
-(-128)-1 = 127, and the algorithm will overflow and saturate. On the other
hand, in the case of a very large number, e = 127, the exponent of x[0] will be
-127-1 = -128. This will cause the algorithm to yield zero, which is a
reasonable handling of that boundary condition.


Example 12-17. Inverse of a Floating-Point Number


* TITL INVERSE OF A FLOATING-POINT NUMBER
*
* SUBROUTINE INVF
*
* THE FLOATING-POINT NUMBER v IS STORED IN R0. AFTER THE
* COMPUTATION IS COMPLETED, 1/v IS ALSO STORED IN R0.
*
* TYPICAL CALLING SEQUENCE:
*       LDF     v,R0
*       CALL    INVF
*
* ARGUMENT ASSIGNMENTS:
*   ARGUMENT | FUNCTION
*   ---------+----------------------------------------------------
*   R0       | v = NUMBER TO FIND THE RECIPROCAL OF (UPON THE CALL)
*   R0       | 1/v (UPON THE RETURN)
*
* REGISTER USED AS INPUT: R0
* REGISTERS MODIFIED: R0, R1, R2, R3
* REGISTER CONTAINING RESULT: R0
*
* CYCLES: 35     WORDS: 32
*
        .global INVF
*
INVF:   LDF     R0,R3           ; v is saved for later.
        ABSF    R0              ; The algorithm uses v = |v|.
*
* EXTRACT THE EXPONENT OF v.
*
        PUSHF   R0
        POP     R1
        ASH     -24,R1          ; The 8 LSBs of R1 contain the exponent
*                               ; of v.
*
* x[0] FORMATION GIVEN THE EXPONENT OF v.
*
        NEGI    R1
        SUBI    1,R1            ; Now we have -e-1, the exponent of x[0].
        ASH     24,R1
        PUSH    R1
        POPF    R1              ; Now R1 = x[0] = 1.0 * 2**(-e-1).
*
* NOW THE ITERATIONS BEGIN.
*
        MPYF    R1,R0,R2        ; R2 = v * x[0]
        SUBRF   2.0,R2          ; R2 = 2.0 - v * x[0]
        MPYF    R2,R1           ; R1 = x[1] = x[0] * (2.0 - v * x[0])
*
        MPYF    R1,R0,R2        ; R2 = v * x[1]
        SUBRF   2.0,R2          ; R2 = 2.0 - v * x[1]
        MPYF    R2,R1           ; R1 = x[2] = x[1] * (2.0 - v * x[1])
*
        MPYF    R1,R0,R2        ; R2 = v * x[2]
        SUBRF   2.0,R2          ; R2 = 2.0 - v * x[2]
        MPYF    R2,R1           ; R1 = x[3] = x[2] * (2.0 - v * x[2])
*
        MPYF    R1,R0,R2        ; R2 = v * x[3]
        SUBRF   2.0,R2          ; R2 = 2.0 - v * x[3]
        MPYF    R2,R1           ; R1 = x[4] = x[3] * (2.0 - v * x[3])
*
        RND     R1              ; This minimizes error in the LSBs.
*
* FOR THE LAST ITERATION WE USE THE FORMULATION:
* x[5] = (x[4] * (1.0 - (v * x[4]))) + x[4]
*
        MPYF    R1,R0,R2        ; R2 = v * x[4] = 1.0..01.. => 1
        SUBRF   1.0,R2          ; R2 = 1.0 - v * x[4] = 0.0..01... => 0
        MPYF    R1,R2           ; R2 = x[4] * (1.0 - v * x[4])
        ADDF    R2,R1           ; R1 = x[5] = (x[4]*(1.0-(v*x[4])))+x[4]
*
        RND     R1,R0           ; Round since this is followed by a MPYF.
*
* NOW THE CASE OF v < 0 IS HANDLED.
*
        NEGF    R0,R2
        LDF     R3,R3           ; This sets condition flags.
        LDFN    R2,R0           ; If v < 0, then R0 = -R0
*
        RETS
*
* END
*
        .end


12.3.5 Square Root 


The square root is implemented on the TMS320C30 by an iterative
algorithm very similar to the one used for the computation of the inverse.
This algorithm computes the inverse of the square root of a number v,
1/SQRT(v). To derive SQRT(v), multiply this result by v. Since division by
the square root of a number is desirable in many applications, having
1/SQRT(v) available directly saves the effort of inverting the square root.


At the i-th iteration, the estimate x[i] of 1/SQRT(v) is computed from v and
the previous estimate x[i-1] according to the formula:


   x[i] = x[i-1] * (1.5 - (v/2) * x[i-1] * x[i-1])


To start the operation, an initial estimate x[0] is needed. If v = a * 2**e, a good
initial estimate is:


   x[0] = 1.0 * 2**(-e/2)
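

The inverse-square-root iteration can be exercised with the following Python sketch. Two assumptions are made that are not part of the text above: the mantissa is normalized so that 1 <= a < 2, and e/2 is taken as a real number (so x[0] may be an odd power of the square root of 2), whereas an integer routine would halve the exponent with a shift. The names `inverse_sqrt` and `square_root` are hypothetical, and the arithmetic is double precision.

```python
import math


def inverse_sqrt(v, iterations=5):
    """Estimate 1/sqrt(v) for v > 0 with x[i] = x[i-1]*(1.5 - (v/2)*x[i-1]**2)."""
    mantissa, exp2 = math.frexp(v)   # v = mantissa * 2**exp2, 0.5 <= mantissa < 1
    e = exp2 - 1                     # so v = a * 2**e with 1 <= a < 2
    x = 2.0 ** (-e / 2.0)            # x[0] = 1.0 * 2**(-e/2), real-valued e/2
    half_v = 0.5 * v                 # v/2 is formed once, outside the loop
    for _ in range(iterations):
        x = x * (1.5 - half_v * x * x)
    return x


def square_root(v):
    """sqrt(v) obtained as v * (1/sqrt(v))."""
    return v * inverse_sqrt(v)


for v in (2.0, 3.0, 16.0):
    print(v, square_root(v), math.sqrt(v))
```

With this starting value the product x[0] * sqrt(v) equals sqrt(a), which lies between 1 and about 1.414, inside the range where the iteration converges to the positive root.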


Example 12-18 shows the implementation of this algorithm on the
TMS320C30, where the iteration has been applied 5 times. The number of
iterations was chosen to obtain maximum accuracy. If
more accuracy is desired, more iterations can be used. If less accuracy is
acceptable, the execution speed of this implementation may be increased by
reducing the number of iterations.


Example 12-18. Square Root of a Floating-Point Number


* TITL SQUARE ROOT OF A FLOATING-POINT NUMBER
*
* SUBROUTINE SQRT
*
* THE FLOATING POINT NUMBER v IS STORED IN R0. AFTER THE
* COMPUTATION IS COMPLETED, SQRT(v) IS ALSO STORED IN R0. NOTE
* THAT THE ALGORITHM ACTUALLY COMPUTES 1/SQRT(v).
*
* TYPICAL CALLING SEQUENCE:
*       LDF     v,R0
*       CALL    SQRT
*
* ARGUMENT ASSIGNMENTS:
*   ARGUMENT | FUNCTION
*   ---------+----------------------------------------------------
*   R0       | v = NUMBER TO FIND THE SQUARE ROOT OF (UPON THE CALL)
*   R0       | SQRT(v) (UPON THE RETURN)
*
* REGISTER USED AS INPUT: R0
* REGISTERS MODIFIED: R0, R1, R2, R3
* REGISTER CONTAINING RESULT: R0
*
* CYCLES: 39     WORDS: 33
*
        .global SQRT
*
* EXTRACT THE EXPONENT OF V.
*
SQRT:   LDF     R0,R3           ; Save v
        RETSLE                  ; Return if number non-positive
        PUSHF   R0
        POP     R1
        ASH     -25,R1          ; The 8 LSBs of R1 contain 1/2 the exponent
*                               ; of v.
*
* X[0] FORMATION GIVEN THE EXPONENT OF V.
*
        NEGI    R1
        ASH     24,R1
        PUSH    R1
        POPF    R1              ; Now R1 = x[0] = 1.0 * 2**(-e/2).
*
* GENERATE V/2.
*
        MPYF    0.5,R0
*
* NOW THE ITERATIONS BEGIN.
*
        MPYF    R1,R1,R2        ; R2 = x[0] * x[0]
        MPYF    R0,R2           ; R2 = (v/2) * x[0] * x[0]
        SUBRF   1.5,R2          ; R2 = 1.5 - (v/2) * x[0] * x[0]
        MPYF    R2,R1           ; R1 = x[1] = x[0]
*                               ;      * (1.5 - (v/2)*x[0]*x[0])
*
        MPYF    R1,R1,R2        ; R2 = x[1] * x[1]
        MPYF    R0,R2           ; R2 = (v/2) * x[1] * x[1]
        SUBRF   1.5,R2          ; R2 = 1.5 - (v/2) * x[1] * x[1]
        MPYF    R2,R1           ; R1 = x[2] = x[1]
*                               ;      * (1.5 - (v/2)*x[1]*x[1])
*
        MPYF    R1,R1,R2        ; R2 = x[2] * x[2]
        MPYF    R0,R2           ; R2 = (v/2) * x[2] * x[2]
        SUBRF   1.5,R2          ; R2 = 1.5 - (v/2) * x[2] * x[2]
        MPYF    R2,R1           ; R1 = x[3] = x[2]
*                               ;      * (1.5 - (v/2)*x[2]*x[2])
*
        MPYF    R1,R1,R2        ; R2 = x[3] * x[3]
        MPYF    R0,R2           ; R2 = (v/2) * x[3] * x[3]
        SUBRF   1.5,R2          ; R2 = 1.5 - (v/2) * x[3] * x[3]
        MPYF    R2,R1           ; R1 = x[4] = x[3]
*                               ;      * (1.5 - (v/2)*x[3]*x[3])
*
        MPYF    R1,R1,R2        ; R2 = x[4] * x[4]
        MPYF    R0,R2           ; R2 = (v/2) * x[4] * x[4]
        SUBRF   1.5,R2          ; R2 = 1.5 - (v/2) * x[4] * x[4]
        MPYF    R2,R1           ; R1 = x[5] = x[4]
*                               ;      * (1.5 - (v/2)*x[4]*x[4])
*
        RND     R1,R0           ; Round
*
        MPYF    R3,R0           ; Sqrt(v) from sqrt(v**(-1))
*
        RETS
*
* END
*
        .end


12.3.6 Extended-Precision Arithmetic 


The TMS320C30 offers 32 bits of precision for integer arithmetic, and 24 bits
of precision in the mantissa for floating-point arithmetic. For higher precision
in floating-point operations, the eight extended-precision registers R0 to R7
contain eight more bits of accuracy. Since no comparable extension is
available for fixed-point arithmetic, this section discusses how fixed-point double
precision can be achieved using the capabilities of the processor. The
technique consists of performing the arithmetic by parts, similar to the way in
which longhand arithmetic is done.


The instruction set has operations ADDC (Add with Carry) and SUBB
(Subtract with Borrow) which use the status carry bit for extended-precision
arithmetic. The carry bit is affected by the arithmetic operations of the ALU,
and by the rotate and shift instructions. It can also be manipulated directly by
setting the status register to certain values. For proper operation, the overflow
mode bit should be reset (OVM = 0) so that the results will not be
replaced by the saturation values. Example 12-19 and Example 12-20 show
64-bit addition and 64-bit subtraction. The first operand is stored in the
registers R0 (low word) and R1 (high word). The second operand is stored in
R2 (low word) and R3 (high word). The result is stored in R0 and R1.
Example 12-19. 64-Bit Addition


* TITL 64-BIT ADDITION
*
* TWO 64-BIT NUMBERS ARE ADDED TO EACH OTHER PRODUCING A 64-BIT
* RESULT. THE NUMBERS X (R1,R0) AND Y (R3,R2) ARE ADDED,
* RESULTING IN W (R1,R0).
*
*         R1 R0
*       + R3 R2
*       -------
*         R1 R0
*
        ADDI    R2,R0
        ADDC    R3,R1


Example 12-20. 64-Bit Subtraction


* TITL 64-BIT SUBTRACTION
*
* TWO 64-BIT NUMBERS ARE SUBTRACTED FROM EACH OTHER PRODUCING
* A 64-BIT RESULT. THE NUMBERS X (R1,R0) AND Y (R3,R2) ARE
* SUBTRACTED, RESULTING IN W (R1,R0).
*
*         R1 R0
*       - R3 R2
*       -------
*         R1 R0
*
        SUBI    R2,R0
        SUBB    R3,R1
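
The arithmetic-by-parts idea behind the two listings above can be modelled in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the carry and borrow are computed explicitly here, whereas on the processor they travel through the status-register carry bit.

```python
MASK32 = 0xFFFFFFFF

def add64(x_lo, x_hi, y_lo, y_hi):
    """Add two 64-bit values held as (low word, high word) pairs."""
    lo = x_lo + y_lo                        # low words, as ADDI does
    carry = lo >> 32                        # carry out of the low word
    hi = (x_hi + y_hi + carry) & MASK32     # high words plus carry, as ADDC does
    return lo & MASK32, hi

def sub64(x_lo, x_hi, y_lo, y_hi):
    """Subtract Y from X, both held as (low word, high word) pairs."""
    lo = x_lo - y_lo                        # low words, as SUBI does
    borrow = 1 if lo < 0 else 0             # borrow out of the low word
    hi = (x_hi - y_hi - borrow) & MASK32    # high words minus borrow, as SUBB does
    return lo & MASK32, hi
```

For example, add64(0xFFFFFFFF, 0, 1, 0) returns (0, 1): the low word wraps to zero and the carry increments the high word.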


When two 32-bit numbers are multiplied, a 64-bit product results. The
procedure for multiplication is to split the 32-bit magnitude values of the
multiplicand X and the multiplier Y into two parts, (X1,X0) and (Y1,Y0)
respectively, with 16 bits each. The operation is done on unsigned numbers,
and the product is adjusted for the sign bit. Example 12-21 shows the
implementation of a 32-bit x 32-bit multiplication.
Example 12-21. 32 by 32 Bit Multiplication


* TITL 32 X 32 BIT MULTIPLICATION
*
* SUBROUTINE EXTMPY
*
* FUNCTION: TWO 32-BIT NUMBERS ARE MULTIPLIED, PRODUCING A 64-BIT
* RESULT. THE TWO NUMBERS (X AND Y) ARE EACH SEPARATED INTO TWO
* PARTS (X1 X0) AND (Y1 Y0), WHERE X0, X1, Y0, AND Y1 ARE 16 BITS.
* THE TOP BIT IN X1 AND Y1 IS THE SIGN BIT. THE PRODUCT IS
* IN TWO WORDS (W0 AND W1). THE MULTIPLICATION IS PERFORMED ON
* POSITIVE NUMBERS, AND THE SIGN IS DETERMINED AT THE END.
*
*              X1  X0      BITS OF PRODUCTS
*           x  Y1  Y0      (NOT COUNTING SIGN)    PRODUCT
*           ---------      -------------------    -------
*              X0*Y0             16+16              P1
*           X0*Y1                16+16              P2
*           X1*Y0                16+16              P3
*        X1*Y1                   16+16              P4
*        ------------
*          W1     W0
*
* ARGUMENT ASSIGNMENTS:
*   ARGUMENT | FUNCTION
*   ---------+------------------------------------------
*   R0       | MULTIPLIER AND LOW WORD OF THE PRODUCT
*   R1       | MULTIPLICAND AND UPPER WORD OF THE PRODUCT
*
* REGISTERS USED AS INPUT: R0, R1
* REGISTERS MODIFIED: R0, R1, R2, R3, R4, AR0, AR1
* REGISTER CONTAINING RESULT: R0, R1
*
* CYCLES: 28 (WORST CASE)        WORDS: 25
*
        .global EXTMPY
*
EXTMPY  XOR3    R0,R1,AR0       ; Store sign
        ABSI    R0              ; Absolute values of X
        ABSI    R1              ; and Y
*
* SEPARATE MULTIPLIER AND MULTIPLICAND INTO TWO PARTS
*
        LDI     -16,AR1
        LSH3    AR1,R0,R2       ; R2 = X1 = Upper 16 bits of X
        AND     0FFFFH,R0       ; R0 = X0 = Lower 16 bits of X
        LSH3    AR1,R1,R3       ; R3 = Y1 = Upper 16 bits of Y
        AND     0FFFFH,R1       ; R1 = Y0 = Lower 16 bits of Y
*
* CARRY OUT THE MULTIPLICATION
*
        MPYI3   R0,R1,R4        ; X0*Y0 = P1
        MPYI    R3,R0           ; X0*Y1 = P2
        MPYI    R2,R1           ; X1*Y0 = P3
        ADDI    R0,R1           ; P2+P3
        MPYI    R2,R3           ; X1*Y1 = P4
*
        LDI     R1,R2
        LSH     16,R2           ; Lower 16 bits of P2+P3
        CMPI    0,AR0           ; Check the sign of the product
        BGED    DONE            ; If >0, multiplication complete (delayed)
        LSH     -16,R1          ; Upper 16 bits of P2+P3
        ADDI3   R4,R2,R0        ; W0 = R0 = Lower word of the product
        ADDC3   R1,R3,R1        ; W1 = R1 = Upper word of the product
*
* NEGATE THE PRODUCT IF THE NUMBERS ARE OF OPPOSITE SIGN
*
        NOT     R0
        ADDI    1,R0
        NOT     R1
        ADDC    0,R1
*
DONE    RETS
*
        .end
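
The partial-product scheme of Example 12-21 can be checked against ordinary integer arithmetic with a short Python model. This is a sketch under stated assumptions rather than a transcription of the routine: the function name is hypothetical, and Python integers do not overflow, so the sum P2+P3 is kept exact here.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF

def extmpy(x, y):
    """Return (W0, W1), the low and high 32-bit words of the signed product x*y."""
    negative = (x < 0) != (y < 0)        # sign of the product
    x, y = abs(x), abs(y)                # multiply magnitudes only
    x1, x0 = x >> 16, x & MASK16         # upper and lower 16 bits of X
    y1, y0 = y >> 16, y & MASK16         # upper and lower 16 bits of Y
    p1 = x0 * y0
    mid = x0 * y1 + x1 * y0              # P2 + P3
    p4 = x1 * y1
    low = p1 + ((mid & MASK16) << 16)    # P1 plus lower 16 bits of P2+P3
    high = (p4 + (mid >> 16) + (low >> 32)) & MASK32
    low &= MASK32
    if negative:                         # two's-complement negate of the pair
        low = (~low + 1) & MASK32
        high = (~high + (1 if low == 0 else 0)) & MASK32
    return low, high
```

As a usage example, extmpy(-3, 5) returns (0xFFFFFFF1, 0xFFFFFFFF), the 64-bit two's-complement encoding of -15.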


12.3.7 Floating-point Format Conversion: IEEE to/from TMS320C30 


In fixed-point arithmetic, the binary point that separates the integer from the
fractional part of the number is fixed at a certain location. For example, if
a 32-bit number is chosen to have the binary point after the most significant
bit (which is also the sign bit), only fractional numbers (numbers with
absolute values less than 1) can be represented. In this case, the number is
said to be a Q31 number, where 31 is the number of fractional bits. All
operations assume that the binary point is fixed at this location.


The fixed-point system, although simple to implement in hardware, imposes
limitations on the dynamic range of the represented number, which causes
scaling problems in many applications. The difficulty is avoided by using
floating-point numbers. A floating-point number consists of a mantissa m
multiplied by a base b raised to an exponent e:


        m * b^e


In current hardware implementations, the mantissa is typically a normalized
number with absolute value between 1 and 2, and the base is b=2. Although
the mantissa is represented as a fixed-point number, the actual value of the
overall number floats the binary point because of the multiplication by b^e.
The exponent e is an integer whose value determines the position of the binary
point in the number. IEEE has established a standard format for the
representation of floating-point numbers.


In order to achieve higher efficiency in the hardware implementation, the 
TMS320C30 uses a floating-point format that differs from the IEEE standard. 
This section describes briefly the two formats and presents software routines 
to convert between them. 
TMS320C30 floating-point format:


In a 32-bit word representing a floating-point number, the first 8 bits
correspond to the exponent e, expressed in two's-complement format. There is
one bit for the sign s, and 23 bits for the fraction f of the mantissa. The
mantissa is expressed in two's-complement form with the binary point after the
most significant non-sign bit. Since this bit is the complement of the sign
bit s, it is suppressed. In other words, the mantissa actually has 24 bits.
One special case occurs when e=-128. In this case, the number is interpreted
as zero independent of the values of s and f (which are by default set to
zero). To summarize, the values of the represented numbers in the TMS320C30
floating-point format are:


        2^e * (01.f)    if s=0
        2^e * (10.f)    if s=1
        0               if e=-128
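
The three cases above can be exercised with a small Python decoder. This is an illustrative sketch, with a hypothetical function name, that assumes the field layout described in the text: exponent in bits 31 to 24, sign in bit 23, fraction in bits 22 to 0. The two-bit prefixes 01 and 10 are read as two's-complement integers, that is +1 and -2.

```python
def c30_to_float(word):
    """Decode a 32-bit TMS320C30 floating-point word into a Python float."""
    e = (word >> 24) & 0xFF
    if e >= 0x80:                 # exponent is two's complement
        e -= 0x100
    s = (word >> 23) & 1
    f = word & 0x7FFFFF
    if e == -128:                 # special case: zero
        return 0.0
    # 01.f is 1 + f/2**23; 10.f is -2 + f/2**23 (two's-complement mantissa)
    mantissa = (-2.0 if s else 1.0) + f / 2**23
    return mantissa * 2.0**e
```

Under this reading, the word 0x00000000 decodes to 1.0, 0xFF800000 decodes to -1.0, and 0x80000000 decodes to zero.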


IEEE floating-point format:


The IEEE floating-point format uses sign-magnitude notation for the mantissa,
and offset by 127 for the exponent. In a 32-bit word representing a
floating-point number, the first bit is the sign bit s. The next 8 bits
correspond to the exponent e, expressed in an offset-by-127 format (the actual
exponent is e-127). The following 23 bits represent the absolute value of the
mantissa with the most significant 1 implied. The binary point is after this
most significant 1. In other words, the mantissa actually has 24 bits. There
are several special cases, summarized below.

The values of the represented numbers in the IEEE floating-point format are:


        (-1)^s * 2^(e-127) * (01.f)     if 0<e<255


Special cases:


        (-1)^s * 0.0                    if e=0 and f=0 (zero)
        (-1)^s * 2^(-126) * (00.f)      if e=0 and f<>0 (denormalized)
        (-1)^s * infinity               if e=255 and f=0 (infinity)
        NaN                             if e=255 and f<>0 (Not a Number)
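
The general case and the four special cases can be written out directly in Python. The sketch below, with a hypothetical function name, decodes a 32-bit word by the rules just listed; it is meant as an illustration of the format and can be compared against the standard struct module.

```python
import math
import struct

def ieee_to_float(word):
    """Decode a 32-bit IEEE single-precision word by the rules listed above."""
    s = (word >> 31) & 1
    e = (word >> 23) & 0xFF
    f = word & 0x7FFFFF
    sign = -1.0 if s else 1.0
    if e == 255:                                  # infinity or NaN
        return math.nan if f else sign * math.inf
    if e == 0:                                    # zero or denormalized
        return sign * (f / 2**23) * 2.0**-126
    return sign * (1 + f / 2**23) * 2.0**(e - 127)

def ieee_reference(word):
    """Reference decoding through the struct module."""
    return struct.unpack('>f', struct.pack('>I', word))[0]
```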


Based on these definitions of the formats, two versions of the conversion
routines were developed. One version handles the complete definition of the
formats. The other ignores some of the special cases (typically the ones that
are very rarely used), but it has the benefit that it executes faster than the
complete conversion. For this discussion, they are referred to as the complete
version and the fast version.


12.3.7.1 IEEE to TMS320C30 Floating-Point Format Conversion 


The fast version of the IEEE-to-TMS320C30 conversion routine was originally 
developed by Keith Henry of Apollo Computer, Inc. The other routines were 
based on this initial input. Example 12-22 shows the fast conversion from 
IEEE to TMS320C30 floating-point format. It properly handles the general 
case when 0<e<255, and zeros (i.e., e=0 and f=0). The other special cases 
(denormalized, infinity, and NaN) are not treated and, if present, will give er- 
roneous results. 
Example 12-22. IEEE to TMS320C30 Conversion (Fast Version)


* TITL IEEE TO TMS320C30 CONVERSION (FAST VERSION)
*
* SUBROUTINE FMIEEE
*
* FUNCTION: CONVERSION BETWEEN THE IEEE FORMAT AND THE
*           320C30 FLOATING POINT NUMBERS. THE NUMBER TO
*           BE CONVERTED IS IN THE LOWER 32 BITS OF R0.
*           THE RESULT IS STORED IN THE UPPER 32 BITS OF R0.
*           UPON ENTERING THE ROUTINE, AR1 POINTS TO THE
*           FOLLOWING TABLE:
*
*           (0) 0xFF800000 <-- AR1
*           (1) 0xFF000000
*           (2) 0x7F000000
*           (3) 0x80000000
*           (4) 0x81000000
*
* ARGUMENT ASSIGNMENTS:
*   ARGUMENT | FUNCTION
*   ---------+--------------------------------
*   R0       | NUMBER TO BE CONVERTED
*   AR1      | POINTER TO TABLE WITH CONSTANTS
*
* REGISTERS USED AS INPUT: R0, AR1
* REGISTERS MODIFIED: R0, R1
* REGISTER CONTAINING RESULT: R0
*
* NOTE: SINCE THE STACK POINTER SP IS USED, MAKE SURE TO
*       INITIALIZE IT IN THE CALLING PROGRAM.
*
* CYCLES: 12 (WORST CASE)        WORDS: 12
*
        .global FMIEEE
*
FMIEEE  AND3    R0,*AR1,R1      ; Replace fraction with 0
        BND     NEG             ; Test sign
        ADDI    R0,R1           ; Shift sign and exponent inserting 0
        LDIZ    *+AR1(1),R1     ; If all zero, generate C30 zero
        SUBI    *+AR1(2),R1     ; Unbias exponent
        PUSH    R1
        POPF    R0              ; Load this as a flt. pt. number
        RETS
*
NEG     PUSH    R1
        POPF    R0              ; Load this as a flt. pt. number
        NEGF    R0,R0           ; Negate if original sign negative
        RETS
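
The effect of the fast conversion on the bit fields can be summarized in Python. The sketch below is a value-level model with a hypothetical function name, not a transcription of the assembly: it rebuilds the TMS320C30 fields from the IEEE fields instead of reproducing the ADDI/SUBI/NEGF sequence, and like the fast routine it covers only the general case (0<e<255) and zeros.

```python
def ieee_to_c30_fast(word):
    """Convert a 32-bit IEEE word to a TMS320C30 word (general case and zeros)."""
    s = (word >> 31) & 1
    biased = (word >> 23) & 0xFF
    f = word & 0x7FFFFF
    if biased == 0 and f == 0:
        return 0x80000000             # C30 zero: e = -128
    e = biased - 127                  # unbias the exponent
    if s:
        if f == 0:
            e -= 1                    # -1.0 * 2**e is stored as -2.0 * 2**(e-1)
        else:
            f = 0x800000 - f          # two's-complement fraction of -(1.f)
    return ((e & 0xFF) << 24) | (s << 23) | f
```

For instance, the IEEE word 0xBF800000 (-1.0) maps to 0xFF800000, and 0x40400000 (3.0) maps to 0x01400000.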


Example 12-23 is the complete conversion between IEEE and TMS320C30
formats. In addition to the general case and the zeros, it handles the special
cases as follows:


•    If NaN (e=255, f<>0), the number is returned intact.


•    If infinity (e=255, f=0), the output is saturated to the most positive or
     negative number respectively.

•    If denormalized (e=0, f<>0), two cases are considered. If the MSB of
     f is 1, the number is converted to TMS320C30 format. Otherwise, an
     underflow occurs and the number is set to zero.
Example 12-23. IEEE to TMS320C30 Conversion (Complete Version)


* TITL IEEE TO TMS320C30 CONVERSION (COMPLETE VERSION)
*
* SUBROUTINE FMIEEE1
*
* FUNCTION: CONVERSION BETWEEN THE IEEE FORMAT AND THE 320C30
*           FLOATING POINT NUMBERS. THE NUMBER TO BE CONVERTED
*           IS IN THE LOWER 32 BITS OF R0. THE RESULT IS STORED
*           IN THE UPPER 32 BITS OF R0.
*
* UPON ENTERING THE ROUTINE, AR1 POINTS TO THE FOLLOWING TABLE:
*
*           (0) 0xFF800000 <-- AR1
*           (1) 0xFF000000
*           (2) 0x7F000000
*           (3) 0x80000000
*           (4) 0x81000000
*           (5) 0x7F800000
*           (6) 0x00400000
*           (7) 0x007FFFFF
*           (8) 0x7F7FFFFF
*
* ARGUMENT ASSIGNMENTS:
*   ARGUMENT | FUNCTION
*   ---------+--------------------------------
*   R0       | NUMBER TO BE CONVERTED
*   AR1      | POINTER TO TABLE WITH CONSTANTS
*
* REGISTERS USED AS INPUT: R0, AR1
* REGISTERS MODIFIED: R0, R1
* REGISTER CONTAINING RESULT: R0
*
* NOTE: SINCE THE STACK POINTER SP IS USED, MAKE SURE TO INITIALIZE
*       IT IN THE CALLING PROGRAM.
*
* CYCLES: 23 (WORST CASE)        WORDS: 34
*
        .global FMIEEE1
*
FMIEEE1 LDI     R0,R1
        AND     *+AR1(5),R1
        BZ      UNNORM          ; If e=0, number is either 0 or
*                               ; unnormalized
        XOR     *+AR1(5),R1
        BNZ     NORMAL          ; If e<255, use regular routine
*
* HANDLE NaN AND INFINITY
*
        TSTB    *+AR1(7),R0
        RETSNZ                  ; Return if NaN
        LDI     R0,R0
        LDFGT   *+AR1(8),R0     ; If positive, infinity =
*                               ; most positive number
        LDFN    *+AR1(5),R0     ; If negative, infinity =
*                               ; most negative number
        RETS
*
* HANDLE ZEROS AND UNNORMALIZED NUMBERS
*
UNNORM  TSTB    *+AR1(6),R0     ; Is the msb of f equal to 1?
        LDFZ    *+AR1(3),R0     ; If not, force the number to zero
        RETSZ                   ; and return
        XOR     *+AR1(6),R0     ; If (msb of f)=1, make it 0
        BND     NEG1
        LSH     1,R0            ; Eliminate sign bit and line up mantissa
        SUBI    *+AR1(2),R0     ; Make e=-127
        PUSH    R0
        POPF    R0              ; Put number in floating point format
        RETS
NEG1    POPF    R0
        NEGF    R0,R0           ; If negative, negate R0
        RETS
*
* HANDLE THE REGULAR CASES
*
NORMAL  AND3    R0,*AR1,R1      ; Replace fraction with 0
        BND     NEG             ; Test sign
        ADDI    R0,R1           ; Shift sign and exponent inserting 0
        SUBI    *+AR1(2),R1     ; Unbias exponent
        PUSH    R1
        POPF    R0              ; Load this as a flt. pt. number
        RETS
*
NEG     POPF    R0              ; Load this as a flt. pt. number
        NEGF    R0,R0           ; Negate if original sign negative
        RETS


12.3.7.2 TMS320C30 to IEEE Floating-Point Format Conversion


The vast majority of the numbers represented by the TMS320C30 format are
covered by the general IEEE format and the representation of zeros. The only
special case to consider is when e=-127 in the TMS320C30 format. This
corresponds to a denormalized number in IEEE format, and it is ignored in the
fast version, while it is treated properly in the complete version. Example
12-24 shows the fast, and Example 12-25 the complete, version of the
TMS320C30-to-IEEE conversion.
Example 12-24. TMS320C30 to IEEE Conversion (Fast Version)


* TITL TMS320C30 TO IEEE CONVERSION (FAST VERSION)
*
* SUBROUTINE TOIEEE
*
* FUNCTION: CONVERSION BETWEEN THE 320C30 FORMAT AND THE IEEE
*           FLOATING POINT NUMBERS. THE NUMBER TO BE CONVERTED
*           IS IN THE UPPER 32 BITS OF R0. THE RESULT WILL BE IN
*           THE LOWER 32 BITS OF R0.
*
* UPON ENTERING THE ROUTINE, AR1 POINTS TO THE FOLLOWING TABLE:
*
*           (0) 0xFF800000 <-- AR1
*           (1) 0xFF000000
*           (2) 0x7F000000
*           (3) 0x80000000
*           (4) 0x81000000
*
* ARGUMENT ASSIGNMENTS:
*   ARGUMENT | FUNCTION
*   ---------+--------------------------------
*   R0       | NUMBER TO BE CONVERTED
*   AR1      | POINTER TO TABLE WITH CONSTANTS
*
* REGISTERS USED AS INPUT: R0, AR1
* REGISTERS MODIFIED: R0
* REGISTER CONTAINING RESULT: R0
*
* NOTE: SINCE THE STACK POINTER 'SP' IS USED, MAKE SURE TO
*       INITIALIZE IT IN THE CALLING PROGRAM.
*
* CYCLES: 14 (WORST CASE)        WORDS: 15
*
        .global TOIEEE
*
TOIEEE  LDF     R0,R0           ; Determine the sign of the number
        LDFZ    *+AR1(4),R0     ; If zero, load appropriate number
        BND     NEG             ; Branch to NEG if negative (delayed)
        ABSF    R0              ; Take the absolute value of the number
        LSH     1,R0            ; Eliminate the sign bit in R0
        PUSHF   R0
        POP     R0              ; Place number in lower 32 bits of R0
        ADDI    *+AR1(2),R0     ; Add exponent bias (127)
        LSH     -1,R0           ; Add the positive sign
        RETS
*
NEG     POP     R0              ; Place number in lower 32 bits of R0
        ADDI    *+AR1(2),R0     ; Add exponent bias (127)
        LSH     -1,R0           ; Make space for the sign
        ADDI    *+AR1(3),R0     ; Add the negative sign
        RETS
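
The reverse mapping can be modelled at the level of the bit fields as well. The Python sketch below, with a hypothetical function name, is not a transcription of TOIEEE: it takes the absolute value of the two's-complement mantissa explicitly and, like the fast routine, makes no attempt to treat e=-127 as a denormalized number.

```python
def c30_to_ieee_fast(word):
    """Convert a 32-bit TMS320C30 word to an IEEE word (general case and zero)."""
    e = (word >> 24) & 0xFF
    if e >= 0x80:                     # exponent is two's complement
        e -= 0x100
    s = (word >> 23) & 1
    f = word & 0x7FFFFF
    if e == -128:
        return 0x00000000             # C30 zero becomes IEEE +0.0
    if s:
        if f == 0:
            e += 1                    # -2.0 * 2**e equals -1.0 * 2**(e+1)
        else:
            f = 0x800000 - f          # magnitude of the negative mantissa
    return (s << 31) | ((e + 127) << 23) | f
```

For instance, the TMS320C30 word 0xFF800000 (-1.0) maps to the IEEE word 0xBF800000.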
Example 12-25. TMS320C30 to IEEE Conversion (Complete Version)


* TITL TMS320C30 TO IEEE CONVERSION (COMPLETE VERSION)
*
* SUBROUTINE TOIEEE1
*
* FUNCTION: CONVERSION BETWEEN THE 320C30 FORMAT AND THE IEEE
*           FLOATING POINT NUMBERS. THE NUMBER TO BE CONVERTED
*           IS IN THE UPPER 32 BITS OF R0. THE RESULT WILL BE
*           IN THE LOWER 32 BITS OF R0.
*
* UPON ENTERING THE ROUTINE, AR1 POINTS TO THE FOLLOWING TABLE:
*
*           (0) 0xFF800000 <-- AR1
*           (1) 0xFF000000
*           (2) 0x7F000000
*           (3) 0x80000000
*           (4) 0x81000000
*           (5) 0x7F800000
*           (6) 0x00400000
*           (7) 0x007FFFFF
*           (8) 0x7F7FFFFF
*
* ARGUMENT ASSIGNMENTS:
*   ARGUMENT | FUNCTION
*   ---------+--------------------------------
*   R0       | NUMBER TO BE CONVERTED
*   AR1      | POINTER TO TABLE WITH CONSTANTS
*
* REGISTERS USED AS INPUT: R0, AR1
* REGISTERS MODIFIED: R0
* REGISTER CONTAINING RESULT: R0
*
* NOTE: SINCE THE STACK POINTER 'SP' IS USED, MAKE SURE TO
*       INITIALIZE IT IN THE CALLING PROGRAM.
*
* CYCLES: 31 (WORST CASE)        WORDS: 25
*
        .global TOIEEE1
*
TOIEEE1 LDF     R0,R0           ; Determine the sign of the number
        LDFZ    *+AR1(4),R0     ; If zero, load appropriate number
        BND     NEG             ; Branch to NEG if negative (delayed)
        ABSF    R0              ; Take the absolute value of the number
        LSH     1,R0            ; Eliminate the sign bit in R0
        PUSHF   R0
        POP     R0              ; Place number in lower 32 bits of R0
        ADDI    *+AR1(2),R0     ; Add exponent bias (127)
        LSH     -1,R0           ; Add the positive sign
*
CONT    TSTB    *+AR1(5),R0
        RETSNZ                  ; If E>0, return
        TSTB    *+AR1(7),R0
        RETSZ                   ; If E=0 & F=0, return
        PUSH    R0
        POPF    R0
        LSH     -1,R0           ; Move F right by one bit
        PUSHF   R0
        POP     R0
        ADDI    *+AR1(6),R0     ; Add to F a msb of 1
        RETS
*
NEG     POP     R0              ; Place number in lower 32 bits of R0
        BRD     CONT
        ADDI    *+AR1(2),R0     ; Add exponent bias (127)
        LSH     -1,R0           ; Make space for the sign
        ADDI    *+AR1(3),R0     ; Add the negative sign
12.4 Application-Oriented Operations


The TMS320C30 has been designed to provide efficient implementations of 
digital signal processing algorithms. The architecture and the instruction set 
of the device include features that facilitate the solution of numerically inten- 
sive problems. This section presents examples of applications using these 
features, such as companding, filtering, Fast Fourier Transforms (FFT), and 
matrix arithmetic. 


12.4.1 Companding 


In the area of telecommunications, one of the primary concerns is the
conservation of the channel bandwidth, while at the same time preserving high
speech quality. This is achieved by quantizing the speech samples
logarithmically. It has been demonstrated that an 8-bit logarithmic quantizer
produces speech quality equivalent to a 13-bit uniform quantizer. The
logarithmic quantization is achieved by companding (COMpress/exPANDing). Two
international standards have been established for companding: the µ-law (used
in the United States and Japan), and the A-law (used in Europe). Detailed
descriptions of µ-law and A-law companding are presented in an application
report on companding routines included in the book "Digital Signal Processing
Applications with the TMS320 Family".


During transmission, logarithmically compressed data in sign-magnitude form
are transmitted along the communications channel. If any processing is
necessary, these data should be expanded to a 14-bit (for µ-law) or 13-bit (for
A-law) linear format. This operation is done upon receiving the data at the
digital signal processor. After processing, and in order to continue
transmission, the result is compressed back to 8-bit format and transmitted
through the channel.


Examples 12-26 and 12-27 show µ-law compression and expansion (i.e., linear
to µ-law and µ-law to linear conversion), while Examples 12-28 and 12-29 show
A-law compression and expansion. For expansion, using a look-up table offers
an alternative approach. It trades memory space for speed of execution. Since
the compressed data is 8 bits long, a table with 256 entries can be
constructed containing the expanded data. If the compressed data is stored in
the register AR0, the following two instructions will put the expanded data in
register R0:


        ADDI    @TABL,AR0       ; @TABL = BASE ADDRESS OF TABLE
        LDI     *AR0,R0         ; PUT EXPANDED NUMBER IN R0


The same look-up table approach could be used for compression, but the
required table length would then be 16,384 words for µ-law or 8,192 words for
A-law. If this memory size is not acceptable, the subroutines presented in
Examples 12-26 or 12-28 should be used.
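
As an illustration of both approaches, the Python sketch below models µ-law compression and expansion with integer operations and then builds the 256-entry expansion table described above. The constants (bias 33, clipping level 0x1FDE) follow the assembly examples of this section; the function names, and the use of the bit length to find the segment instead of a floating-point normalization, are choices made for this sketch.

```python
BIAS = 33
CLIP = 0x1FDE

def mulaw_compress(sample):
    """Compress a 14-bit linear sample to an 8-bit mu-law code."""
    magnitude = min(abs(sample), CLIP) + BIAS     # saturate, then add bias
    segment = magnitude.bit_length() - 6          # position of the leading 1
    step = (magnitude >> (segment + 1)) & 0x0F    # four bits after the leading 1
    code = (segment << 4) | step
    if sample < 0:
        code |= 0x80                              # set sign bit
    return ~code & 0xFF                           # reverse all bits for transmission

def mulaw_expand(code):
    """Expand an 8-bit mu-law code to a linear sample."""
    code = ~code & 0xFF                           # complement bits
    step = code & 0x0F                            # quantization bin
    segment = (code >> 4) & 0x07                  # segment code
    value = (((step << 1) + BIAS) << segment) - BIAS
    return -value if code & 0x80 else value

# Look-up table for expansion: one entry per possible 8-bit code.
EXPAND_TABLE = [mulaw_expand(code) for code in range(256)]
```

With the table in place, expansion reduces to the single indexing operation EXPAND_TABLE[code], which mirrors the two-instruction sequence shown above.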
Example 12-26. µ-Law Compression


* TITL U-LAW COMPRESSION
*
* SUBROUTINE MUCMPR
*
* ARGUMENT ASSIGNMENTS:
*   ARGUMENT | FUNCTION
*   ---------+-----------------------
*   R0       | NUMBER TO BE CONVERTED
*
* REGISTERS USED AS INPUT: R0
* REGISTERS MODIFIED: R0, R1, R2, SP
* REGISTER CONTAINING RESULT: R0
*
* NOTE: SINCE THE STACK POINTER 'SP' IS USED IN THE COMPRESSION
*       ROUTINE 'MUCMPR', MAKE SURE TO INITIALIZE IT IN THE
*       CALLING PROGRAM.
*
* CYCLES: 20        WORDS: 17
*
        .global MUCMPR
*
MUCMPR  LDI     R0,R1           ; Save sign of number
        ABSI    R0,R0
        CMPI    1FDEH,R0        ; If R0>0x1FDE,
        LDIGT   1FDEH,R0        ; saturate the result
        ADDI    33,R0           ; Add bias
        FLOAT   R0              ; Normalize: (seg+5)0WXYZx...x
        MPYF    0.03125,R0      ; Adjust segment number by 2**(-5)
        LSH     1,R0            ; (seg)WXYZx...x
        PUSHF   R0
        POP     R0              ; Treat number as integer
        LSH     -20,R0          ; Right-justify
        LDI     0,R2
        LDI     R1,R1           ; If number is negative,
        LDILT   80H,R2          ; set sign bit
        ADDI    R2,R0           ; R0 = compressed number
        NOT     R0              ; Reverse all bits for transmission
        RETS
Example 12-27. µ-Law Expansion


* TITL 'U-LAW EXPANSION'
*
* SUBROUTINE MUXPND
*
* ARGUMENT ASSIGNMENTS:
*   ARGUMENT | FUNCTION
*   ---------+-----------------------
*   R0       | NUMBER TO BE CONVERTED
*
* REGISTERS USED AS INPUT: R0
* REGISTERS MODIFIED: R0, R1, R2, SP
* REGISTER CONTAINING RESULT: R0
*
* CYCLES: 20 (WORST CASE)        WORDS: 14
*
        .global MUXPND
*
MUXPND  NOT     R0,R0           ; Complement bits
        LDI     R0,R1
        AND     0FH,R1          ; Isolate quantization bin
        LSH     1,R1
        ADDI    33,R1           ; Add bias to introduce 1xxxx1
        LDI     R0,R2           ; Store for sign bit
        LSH     -4,R0
        AND     7,R0            ; Isolate segment code
        LSH3    R0,R1,R0        ; Shift and put result in R0
        SUBI    33,R0           ; Subtract bias
        TSTB    80H,R2          ; Test sign bit
        RETSZ
        NEGI    R0              ; Negate if a negative number
        RETS
Example 12-28. A-Law Compression


* TITL A-LAW COMPRESSION
*
* SUBROUTINE ACMPR
*
* ARGUMENT ASSIGNMENTS:
*   ARGUMENT | FUNCTION
*   ---------+-----------------------
*   R0       | NUMBER TO BE CONVERTED
*
* REGISTERS USED AS INPUT: R0
* REGISTERS MODIFIED: R0, R1, R2, SP
* REGISTER CONTAINING RESULT: R0
*
* NOTE: SINCE THE STACK POINTER 'SP' IS USED IN THE COMPRESSION
*       ROUTINE 'ACMPR', MAKE SURE TO INITIALIZE IT IN THE
*       CALLING PROGRAM.
*
* CYCLES: 22        WORDS: 19
*
        .global ACMPR
*
ACMPR   LDI     R0,R1           ; Save sign of number
        ABSI    R0,R0
        CMPI    1FH,R0          ; If R0<0x20,
        BLED    END             ; do linear coding
        CMPI    0FFFH,R0        ; If R0>0xFFF,
        LDIGT   0FFFH,R0        ; saturate the result
        LSH     -1,R0           ; Eliminate rightmost bit
        FLOAT   R0              ; Normalize: (seg+3)0WXYZx...x
        MPYF    0.125,R0        ; Adjust segment number by 2**(-3)
        LSH     1,R0            ; (seg)WXYZx...x
        PUSHF   R0
        POP     R0              ; Treat number as integer
        LSH     -20,R0          ; Right-justify
END     LDI     0,R2
        LDI     R1,R1           ; If number is negative,
        LDILT   80H,R2          ; set sign bit
        ADDI    R2,R0           ; R0 = compressed number
        XOR     0D5H,R0         ; Invert even bits for transmission
        RETS


Example 12-29. A-Law Expansion


* TITL A-LAW EXPANSION
*
* SUBROUTINE AXPND
*
* ARGUMENT ASSIGNMENTS:
*   ARGUMENT | FUNCTION
*   ---------+-----------------------
*   R0       | NUMBER TO BE CONVERTED
*
* REGISTERS USED AS INPUT: R0
* REGISTERS MODIFIED: R0, R1, R2, SP
* REGISTER CONTAINING RESULT: R0
*
* CYCLES: 25 (WORST CASE)        WORDS: 16
*
        .global AXPND
*
AXPND   XOR     0D5H,R0         ; Invert even bits
        LDI     R0,R1
        AND     0FH,R1          ; Isolate quantization bin
        LSH     1,R1
        LDI     R0,R2           ; Store for sign bit
        LSH     -4,R0
        AND     7,R0            ; Isolate segment code
        BZ      SKIP1
        SUBI    1,R0
        ADDI    32,R1           ; Create 1xxxx1
SKIP1   ADDI    1,R1            ; or 0xxxx1
        LSH3    R0,R1,R0        ; Shift and put result in R0
        TSTB    80H,R2          ; Test sign bit
        RETSZ
        NEGI    R0              ; Negate if a negative number
        RETS
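
The A-law pair can be modelled in the same way as the µ-law routines. The Python sketch below follows the steps named in the comments of Examples 12-28 and 12-29 (linear coding for small magnitudes, saturation at 0xFFF, segment and quantization bin, inversion with 0xD5); the function names and the use of the bit length to locate the segment are assumptions of this sketch, not part of the original routines.

```python
def alaw_compress(sample):
    """Compress a 13-bit linear sample to an 8-bit A-law code."""
    magnitude = abs(sample)
    if magnitude <= 0x1F:
        code = magnitude >> 1                     # linear coding, segment 0
    else:
        magnitude = min(magnitude, 0xFFF) >> 1    # saturate, drop rightmost bit
        segment = magnitude.bit_length() - 4      # leading 1 at bit 4 is segment 1
        step = (magnitude >> (segment - 1)) & 0x0F
        code = (segment << 4) | step
    if sample < 0:
        code |= 0x80                              # set sign bit
    return code ^ 0xD5                            # invert even bits (and sign)

def alaw_expand(code):
    """Expand an 8-bit A-law code to a linear sample."""
    code ^= 0xD5                                  # undo the inversion
    step = code & 0x0F                            # quantization bin
    segment = (code >> 4) & 0x07                  # segment code
    value = (step << 1) + 1                       # 0xxxx1
    if segment:
        value = (value + 32) << (segment - 1)     # 1xxxx1, shifted by segment-1
    return -value if code & 0x80 else value
```

In this model every 8-bit code survives a round trip through expansion and compression.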


12.4.2 FIR, IIR, and Adaptive Filters 


Digital filters are a common requirement for digital signal processing systems.
There are two types of digital filters, Finite Impulse Response (FIR) and
Infinite Impulse Response (IIR). Each of these types can have either fixed or
adaptable coefficients. In this section, the fixed-coefficient filters are
presented first, and then the adaptive filters are discussed.
12.4.2.1 FIR Filters


If the FIR filter has an impulse response h[0], h[1], ..., h[N-1], and x[n]
represents the input of the filter at time n, the output y[n] at time n is
given by the equation:


        y[n] = h[0] x[n] + h[1] x[n-1] + ... + h[N-1] x[n-(N-1)]


Two features of the TMS320C30 that facilitate the implementation of FIR
filters are parallel multiply/add operations and circular addressing. The
first permits a multiplication and an addition to be performed in a single
machine cycle, while the second makes a finite buffer of length N sufficient
for the data x.


Figure 12-1 shows the arrangement of the memory locations in order to im- 
plement the circular addressing, while Example 12-30 presents the 
TMS320C30 assembly code for an FIR filter. 


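[Figure: memory map from low address to high address with three columns:
the impulse response, the initial input samples, and the final input samples.
The input samples are held in a circular queue; the newest input is marked
at the high-address end.]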


Figure 12-1. Data Memory Organization For a FIR Filter 


In order to set up the circular addressing, the block-size register BK should 
be initialized to block length N. Also, the locations for signal x should start 
from a memory location whose address is a multiple of the smallest power of 
2 that is greater than or equal to N. For instance, if N=24, the first address for 
x should be a multiple of 32 (the lower 5 bits of the beginning address should 
be zero). To understand this requirement, look at the section describing cir- 
cular addressing. 
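
The alignment rule, as stated in the paragraph above, can be expressed as a small Python helper. The function names are hypothetical, and the sketch implements the rule exactly as worded here; the section on circular addressing remains the authority for the precise condition, in particular when N is itself a power of 2.

```python
def circular_buffer_alignment(n):
    """Smallest power of 2 that is greater than or equal to n (n >= 1)."""
    return 1 << (n - 1).bit_length()

def is_valid_start(address, n):
    """True if address is a multiple of the alignment required for length n."""
    return address % circular_buffer_alignment(n) == 0
```

For N=24 the helper returns 32, so a buffer starting at address 0x0820 qualifies while one starting at 0x0810 does not.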
In Example 12-30, the pointer to the input sequence x is incremented and
assumed to be moving from an older input to a newer input. At the end of the
subroutine, AR1 will be pointing to the position for the next input sample.


Example 12-30. FIR Filter


* TITL FIR FILTER
*
* SUBROUTINE FIR
*
* EQUATION: y(n) = h(0) * x(n) + h(1) * x(n-1) +
*                  ... + h(N-1) * x(n-(N-1))
*
* TYPICAL CALLING SEQUENCE:
*
*       LOAD AR0
*       LOAD AR1
*       LOAD RC
*       LOAD BK
*       CALL FIR
*
* ARGUMENT ASSIGNMENTS:
*   ARGUMENT | FUNCTION
*   ---------+---------------------------
*   AR0      | ADDRESS OF h(N-1)
*   AR1      | ADDRESS OF x(N-1)
*   RC       | LENGTH OF FILTER - 2 (N-2)
*   BK       | LENGTH OF FILTER (N)
*
* REGISTERS USED AS INPUT: AR0, AR1, RC, BK
* REGISTERS MODIFIED: R0, R2, AR0, AR1, RC
* REGISTER CONTAINING RESULT: R0
*
* CYCLES: 11 + (N-1)        WORDS: 6
*
        .global FIR
*                                       ; Initialize R0:
FIR     MPYF3   *AR0++(1),*AR1++(1)%,R0 ; h(N-1) * x(n-(N-1)) -> R0
        LDF     0.0,R2                  ; Initialize R2.
*
* FILTER (1 <= i < N)
*
        RPTS    RC                      ; Set up the repeat cycle.
        MPYF3   *AR0++(1),*AR1++(1)%,R0 ; h(N-1-i)*x(n-(N-1-i)) -> R0
||      ADDF3   R0,R2,R2                ; Multiply and add operation
*
        ADDF    R0,R2,R0                ; Add last product
*
* RETURN SEQUENCE
*
        RETS                            ; Return
*
* end
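
For readers who want to check the data flow of the FIR routine, the following Python sketch models a length-N filter whose input samples live in a circular buffer of exactly N entries. The function name and the index bookkeeping are assumptions of this sketch; on the TMS320C30 the wrap-around is handled by the circular addressing hardware rather than by a modulo operation.

```python
def fir_step(h, buffer, newest, sample):
    """Process one input sample.

    h       -- impulse response h[0] ... h[N-1]
    buffer  -- circular buffer of N past inputs (modified in place)
    newest  -- index at which the new sample is stored
    Returns (y, index for the next input sample).
    """
    n = len(h)
    buffer[newest] = sample                        # overwrite the oldest input
    acc = 0.0
    for k in range(n):
        acc += h[k] * buffer[(newest - k) % n]     # h[k] * x[n-k], with wrap-around
    return acc, (newest + 1) % n
```

A typical use keeps the buffer and the index between calls: start with buffer = [0.0] * N and newest = 0, then feed each new sample through fir_step and carry the returned index into the next call.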
12.4.2.2 IIR Filters


The transfer function of an IIR filter has both poles and zeros. Its output
depends on both the input and the past output. As a rule, IIR filters need
less computation than an FIR filter with a similar frequency response, but
they have the drawback of being sensitive to coefficient quantization. Most
often, IIR filters are implemented as a cascade of second-order sections,
called biquads. Examples 12-31 and 12-32 show the implementation for one
biquad and for any number of biquads respectively.


The equation for a single biquad is given by:

        y[n] = a1 y[n-1] + a2 y[n-2] + b0 x[n] + b1 x[n-1] + b2 x[n-2]


This equation can be implemented more conveniently by the following two
equations, which have smaller storage requirements:


        d[n] = a2 d[n-2] + a1 d[n-1] + x[n]
        y[n] = b2 d[n-2] + b1 d[n-1] + b0 d[n]


Figure 12-2 shows the memory organization for this approach, and Example 
12-31 is an implementation of a single biquad on the TMS320C30. 


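[Figure: memory map from low address to high address. The filter
coefficients are stored in the order a2, b2, a1, b1, b0. The delay node
values form a circular queue, shown with its initial contents (d(n),
d(n-1), d(n-2) positions, the newest delay first) and its final contents.]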


Figure 12-2. Data Memory Organization For a Single Biquad 


As in the case of FIR filters, the address for the start of the values d must be 
a multiple of 4, i.e., the last two bits of the beginning address must be zero. 
The block-size register BK must be initialized to 3. 
Example 12-31. IIR Filter (One Biquad)


* TITL IIR FILTER
*
* SUBROUTINE IIR1
*
* IIR1 == IIR FILTER (ONE BIQUAD)
*
* EQUATIONS: d(n) = a2 * d(n-2) + a1 * d(n-1) + x(n)
*            y(n) = b2 * d(n-2) + b1 * d(n-1) + b0 * d(n)
*
*        OR  y(n) = a1*y(n-1) + a2*y(n-2) + b0*x(n)
*                   + b1*x(n-1) + b2*x(n-2)
*
* TYPICAL CALLING SEQUENCE:
*
*       load R2
*       load AR0
*       load AR1
*       load BK
*       CALL IIR1
*
* ARGUMENT ASSIGNMENTS:
*   ARGUMENT | FUNCTION
*   ---------+---------------------------------------
*   R2       | INPUT SAMPLE X(N)
*   AR0      | ADDRESS OF FILTER COEFFICIENTS (A2)
*   AR1      | ADDRESS OF DELAY NODE VALUES (D(N-2))
*   BK       | BK = 3
*
* REGISTERS USED AS INPUT: R2, AR0, AR1, BK
* REGISTERS MODIFIED: R0, R1, R2, AR0, AR1
* REGISTER CONTAINING RESULT: R0
*
* CYCLES: 11        WORDS: 8
*
* FILTER
*
        .global IIR1
*
IIR1    MPYF3   *AR0,*AR1,R0            ; a2 * d(n-2) -> R0
        MPYF3   *++AR0(1),*AR1--(1)%,R1 ; b2 * d(n-2) -> R1
*
        MPYF3   *++AR0(1),*AR1,R0       ; a1 * d(n-1) -> R0
||      ADDF3   R0,R2,R2                ; a2*d(n-2)+x(n) -> R2
*
        MPYF3   *++AR0(1),*AR1--(1)%,R0 ; b1 * d(n-1) -> R0
||      ADDF3   R0,R2,R2                ; a1*d(n-1)+a2*d(n-2)+x(n) -> R2
*
        MPYF3   *++AR0(1),R2,R2         ; b0 * d(n) -> R2
||      STF     R2,*AR1++(1)%           ; Store d(n) and point to d(n-1).
*
        ADDF    R0,R2                   ; b1*d(n-1)+b0*d(n) -> R2
        ADDF    R1,R2,R0                ; b2*d(n-2)+b1*d(n-1)+b0*d(n) -> R0
*
* RETURN SEQUENCE
*
        RETS                            ; Return
*
* end
*
        .end
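
The two-equation form used by the biquad routine can be written out in Python to make the state handling explicit. This is an illustrative sketch with hypothetical names; the delay values are passed around as a tuple here, whereas the assembly routine keeps them in a three-entry circular queue.

```python
def biquad_step(coef, state, x):
    """One output sample of a single biquad.

    coef  -- (a1, a2, b0, b1, b2)
    state -- (d[n-1], d[n-2])
    Returns (y[n], new state).
    """
    a1, a2, b0, b1, b2 = coef
    d1, d2 = state
    d0 = a2 * d2 + a1 * d1 + x          # d[n] = a2 d[n-2] + a1 d[n-1] + x[n]
    y = b2 * d2 + b1 * d1 + b0 * d0     # y[n] = b2 d[n-2] + b1 d[n-1] + b0 d[n]
    return y, (d0, d1)
```

Only two delay values are carried from one sample to the next, compared with the four past values (two inputs and two outputs) needed by the single-equation form.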

In the more general case, the IIR filter will contain N>1 biquads. The
equations for its implementation are given by the following pseudo-C language
code:


        y[0,n] = x[n]

        for (i=0; i<N; i++){
            d[i,n] = a2[i] d[i,n-2] + a1[i] d[i,n-1] + y[i-1,n]
            y[i,n] = b2[i] d[i,n-2] + b1[i] d[i,n-1] + b0[i] d[i,n]
        }
        y[n] = y[N-1,n]


The corresponding memory organization is shown in Figure 12-3, while Example
12-32 is the TMS320C30 assembly-language code.
[Figure: memory map from low address to high address. The filter
coefficients are stored biquad by biquad in the order a2(i), b2(i), a1(i),
b1(i), b0(i), for i = 0 to N-1. For each biquad, the delay node values
d(i,n), d(i,n-1), d(i,n-2) form a circular queue followed by one empty
location, shown with initial and final contents.]


Figure 12-3. Data Memory Organization For N Biquads 


The block-size register BK should be initialized to 3, and the beginning of
each set of d values (i.e., d[i,n], i=0...N-1) should be at an address that is
a multiple of 4 (the last two bits zero), as stated in the case of a single
biquad.
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Example 12-32. IIR Filters (N > 1 Biquads) 
TITL IIR FILTERS (N > 1 BIQUADS) 


SUBROUTINE IIR2 


REGISTERS USED AS INPUT: R2, ARO, AR1, IRO, IR1, BK, RC 
REGISTERS MODIFIED: RO, R1, R2, ARO, AR1, RC 
REGISTER CONTAINING RESULT: RO 


CYCLES: 23 + 6(N-1) WORDS: 17 


* 

* 

* 

* 

* 

* 

* 

* 

* EQUATIONS: y(O,n) = x(n) 

* 

* FOR (i = O; i < Nj itt) 

. { 

* Q(i,n) = a2(i) * da(i,n-2) + al(i) * d(i,n-1) + y(i-1,n 
* y(i,n) = b2(i) * A(i,n-2) + bl(i) * A(i,n-1) + bO(i) * d(i,n) 
’ } 

. y(n) = y(N-1,n) 

* TYPICAL CALLING SEQUENCE: 

* 

“ load R2 

“ load ARO 

“ load AR1 

" load IRO 

ie load IR1 

x load BK 

x load RC 

* CALL TIR2 

* 

* 

* ARGUMENT ASSIGNMENTS: 

* ARGUMENT | FUNCTION 

, ee me me me ee ee ee ee ee ee 

* R2 | INPUT SAMPLE x(n) 

i ARO { ADDRESS OF FILTER COEFFICIENTS (a2(0)) 
ig ARI | ADDRESS OF DELAY NODE VALUES (d(0,n-2)) 
* BK | BK = 3 

. IRO | IRO = 4 

* IR1 | IR1L = 4*N-4 

* RC | NUMBER OF BIQUADS (N) - 2 
* 

* 

* 

* 

* 

* 

* 

* 

* 

* 


        .global IIR2
*
IIR2    MPYF3   *AR0,*AR1,R0
*                                       ; a2(0) * d(0,n-2) -> R0
        MPYF3   *++AR0(1),*AR1--(1)%,R1
*                                       ; b2(0) * d(0,n-2) -> R1
*
        MPYF3   *++AR0(1),*AR1,R0       ; a1(0) * d(0,n-1) -> R0
||      ADDF3   R0,R2,R2                ; First sum term of d(0,n).
*
        MPYF3   *++AR0(1),*AR1--(1)%,R0 ; b1(0) * d(0,n-1) -> R0
||      ADDF3   R0,R2,R2                ; Second sum term of d(0,n).
*
        MPYF3   *++AR0(1),R2,R2         ; b0(0) * d(0,n) -> R2
||      STF     R2,*AR1--(1)%
*                                       ; Store d(0,n); point to d(0,n-2).
*
        RPTB    LOOP                    ; Loop for 1 <= i < N
*
        MPYF3   *++AR0(1),*++AR1(IR0),R0 ; a2(i) * d(i,n-2) -> R0
||      ADDF3   R0,R2,R2                ; First sum term of y(i-1,n).
*
        MPYF3   *++AR0(1),*AR1--(1)%,R1 ; b2(i) * d(i,n-2) -> R1
||      ADDF3   R1,R2,R2                ; Second sum term of y(i-1,n).
*
        MPYF3   *++AR0(1),*AR1,R0       ; a1(i) * d(i,n-1) -> R0
||      ADDF3   R0,R2,R2                ; First sum term of d(i,n).
*
        MPYF3   *++AR0(1),*AR1--(1)%,R0 ; b1(i) * d(i,n-1) -> R0
||      ADDF3   R0,R2,R2                ; Second sum term of d(i,n).
*
        STF     R2,*AR1--(1)%
*                                       ; Store d(i,n); point to d(i,n-2).
LOOP    MPYF3   *++AR0(1),R2,R2
*                                       ; b0(i) * d(i,n) -> R2
*
*   FINAL SUMMATION
*
        ADDF    R0,R2                   ; First sum term of y(N-1,n)
        ADDF3   R1,R2,R0                ; Second sum term of y(N-1,n)
*
        NOP     *AR1--(IR1)             ; Return to first biquad
        NOP     *AR1--(1)%              ; Point to d(0,n-1)
*
*   RETURN SEQUENCE
*
        RETS                            ; Return
*
*   end
*
        .end
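
The recursion stated in the header of Example 12-32 can be cross-checked on a host machine with a short reference model. The following Python sketch is only an illustration of the equations; it does not model the circular addressing or the register usage of the assembly code. The function name iir_cascade and the per-biquad tuple layout (a1, a2, b0, b1, b2) are assumptions made for this sketch.

```python
def iir_cascade(x, coeffs, delays):
    """Process one input sample x(n) through N cascaded biquads.

    coeffs: list of (a1, a2, b0, b1, b2), one tuple per biquad
            (hypothetical layout, chosen for readability).
    delays: list of [d(i,n-1), d(i,n-2)], one pair per biquad,
            updated in place so the next call sees the aged values.
    """
    y = x  # input to the first biquad
    for (a1, a2, b0, b1, b2), d in zip(coeffs, delays):
        d_new = a2 * d[1] + a1 * d[0] + y        # d(i,n)
        y = b2 * d[1] + b1 * d[0] + b0 * d_new   # y(i,n)
        d[1], d[0] = d[0], d_new                 # age the delay line
    return y
```

Calling the function once per input sample, with the same delays list each time, produces the output sequence y(n).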


12.4.2.3 Adaptive Filters (LMS Algorithm) 


Some digital signal processing applications require a filter that adapts over
time to track changing conditions. The book “Theory and Design of Adaptive
Filters” by Treichler, Johnson, and Larimore (Wiley-Interscience, 1987)
presents the theory of adaptive filters. Although in theory both FIR and IIR
structures can be used as adaptive filters, the stability problems and the
local optimum points that IIR filters exhibit make them less attractive for
such an application. Hence, until further research makes IIR filters a better
choice, only FIR filters are used in the adaptive algorithms of practical
applications.


In an adaptive FIR filter, the filtering equation takes the form: 
    y[n] = h[n,0] x[n] + h[n,1] x[n-1] + ... + h[n,N-1] x[n-(N-1)]

The filter coefficients are time-dependent. In a least-mean-squares (LMS)
algorithm, the coefficients are updated by a formula of the form:

    h[n+1,i] = h[n,i] + β x[n-i],   i = 0, 1, ..., N-1

β is a constant for the computation. The updating of the filter coefficients can
be interleaved with the computation of the filter output so that it takes 3 cycles
per filter tap to do both. The updated coefficients are written over the old
filter coefficients. Example 12-33 shows the implementation of an adaptive
FIR filter on the TMS320C30. The memory organization and the positioning
of the data in memory should follow the same rules as the above FIR filter with
fixed coefficients.
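
As a host-side illustration of the two formulas above, the following Python sketch computes one filter output and one coefficient update. It is a minimal model under stated assumptions, not the TMS320C30 implementation: the function name lms_step and the list-based argument layout are hypothetical, and the scale factor is passed in by the caller rather than derived from an error signal.

```python
def lms_step(h, x_hist, beta):
    """One step of an adaptive FIR filter.

    h[i]      : coefficient for tap i at time n, i = 0..N-1
    x_hist[i] : input sample x[n-i], i = 0..N-1
    beta      : scale factor, constant for this step
    Returns (y, h_next): the output computed from the old coefficients
    and the updated coefficients for time n+1.
    """
    # Filter output uses the coefficients as they were before the update.
    y = sum(hi * xi for hi, xi in zip(h, x_hist))
    # Each coefficient moves by beta times its own input sample.
    h_next = [hi + beta * xi for hi, xi in zip(h, x_hist)]
    return y, h_next
```

The assembly routine performs both computations in the same loop and overwrites the coefficients in place; the sketch returns a new list only to keep the two results easy to inspect.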
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Example 12-33. Adaptive FIR Filter (LMS Algorithm) 


*
*   TITL ADAPTIVE FIR FILTER (LMS ALGORITHM)
*
*   SUBROUTINE LMS
*
*   LMS == LMS ADAPTIVE FILTER
*
*   EQUATIONS:  y(n) = h(n,0)*x(n) + h(n,1)*x(n-1) + ...
*                      + h(n,N-1)*x(n-(N-1))
*               FOR (i = 0; i < N; i++)
*                 h(n+1,i) = h(n,i) + tmuerr * x(n-i)
*
*   TYPICAL CALLING SEQUENCE:
*
*   load R4
*   load AR0
*   load AR1
*   load RC
*   load BK
*   CALL LMS
*
*   ARGUMENT ASSIGNMENTS:
*
*   ARGUMENT | FUNCTION
*   ---------+-------------------------------
*   R4       | SCALE FACTOR (2 * mu * err)
*   AR0      | ADDRESS OF h(n,N-1)
*   AR1      | ADDRESS OF x(n-(N-1))
*   RC       | LENGTH OF FILTER - 2 (N-2)
*   BK       | LENGTH OF FILTER (N)
*
*   REGISTERS USED AS INPUT: R4, AR0, AR1, RC, BK
*   REGISTERS MODIFIED: R0, R1, R2, AR0, AR1, RC
*   REGISTER CONTAINING RESULT: R0
*
*   PROGRAM SIZE: 10 words
*
*   EXECUTION CYCLES: 12 + 3(N-1)
*
*   SETUP (i = 0)
*

        .global LMS
*
*                                       ; Initialize R0:
LMS     MPYF3   *AR0,*AR1,R0
*                                       ; h(n,N-1) * x(n-(N-1)) -> R0
        LDF     0.0,R2                  ; Initialize R2.
*
*                                       ; Initialize R1:
        MPYF3   *AR1++(1)%,R4,R1
*                                       ; x(n-(N-1)) * tmuerr -> R1
        ADDF3   *AR0++(1),R1,R1
*                                       ; h(n,N-1) + x(n-(N-1)) *
*                                       ; tmuerr -> R1
*
*   FILTER AND UPDATE (1 <= i < N)
*
        RPTB    LOOP                    ; Setup the repeat block.
*
*                                       ; Filter:
        MPYF3   *AR0--(1),*AR1,R0       ; h(n,N-1-i) * x(n-(N-1-i)) -> R0
||      ADDF3   R0,R2,R2                ; Multiply and add operation.
*
*                                       ; Update:
        MPYF3   *AR1++(1)%,R4,R1        ; x(n-(N-1-i)) * tmuerr -> R1
||      STF     R1,*AR0++(1)            ; R1 -> h(n+1,N-1-(i-1))
*
LOOP    ADDF3   *AR0++(1),R1,R1
*                                       ; h(n,N-1-i) + x(n-(N-1-i))*tmuerr -> R1
*
        ADDF3   R0,R2,R0                ; Add last product.
        STF     R1,*-AR0(1)             ; h(n,0) + x(n) * tmuerr -> h(n+1,0)
*
*   RETURN SEQUENCE
*
        RETS                            ; Return
*
*   end
*
        .end


12.4.3 Matrix-Vector Multiplication 


In matrix-vector multiplication, a K x N matrix of elements m(i,j) having K 
rows and N columns is multiplied by an N x 1 vector to produce a K x 1 result. 
The multiplier vector has elements v(j), and the product vector has elements 
p(i). Each one of the product-vector elements is computed by the expression: 


    p(i) = m(i,0) v(0) + m(i,1) v(1) + ... + m(i,N-1) v(N-1),   i = 0, 1, ..., K-1


This is essentially a dot product, and the matrix-vector multiplication contains 
as a special case the dot product presented in Example 12-2. In pseudo-C 
format, the computation of the matrix multiplication is expressed by: 


    for (i = 0; i < K; i++) {
        p(i) = 0
        for (j = 0; j < N; j++)
            p(i) = p(i) + m(i,j) * v(j)
    }


Figure 12-4 shows the data memory organization for matrix-vector multipli- 
cation, and Example 12-34 is the TMS320C30 assembly code to implement 
it. Note that in Example 12-34, K (number of rows) should be greater than 0 
and N (number of columns) should be greater than 1. 
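
The pseudo-C loop above can be exercised directly on a host machine. The following Python sketch is a reference model only; it assumes the matrix is stored row by row in one flat list, and the function name mat_vec and its argument names are hypothetical.

```python
def mat_vec(m_flat, v, k_rows, n_cols):
    """Multiply a K x N matrix, stored row by row in m_flat, by vector v."""
    # Same restrictions as stated for the assembly routine.
    assert k_rows > 0 and n_cols > 1
    p = []
    for i in range(k_rows):
        row = m_flat[i * n_cols:(i + 1) * n_cols]
        acc = 0.0
        for m_ij, v_j in zip(row, v):
            acc += m_ij * v_j        # dot product of row i with v
        p.append(acc)
    return p
```

For example, mat_vec([1, 2, 3, 4, 5, 6], [1, 0, -1], 2, 3) multiplies a 2 x 3 matrix by a 3-element vector.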
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input result 
              matrix storage    vector storage    vector storage

low address   m(0,0)            v(0)              p(0)
              m(0,1)            v(1)              p(1)
                .                 .                 .
                .                 .                 .
              m(0,N-1)          v(N-1)            p(K-1)
              m(1,0)
              m(1,1)
                .
                .
high address    .

Figure 12-4. Data Memory Organization for Matrix-Vector Multiplication 
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Example 12-34. Matrix times a vector multiplication 
*
*   TITL MATRIX TIMES A VECTOR MULTIPLICATION
*
*   SUBROUTINE MAT
*
*   MAT == MATRIX TIMES A VECTOR OPERATION
*
*   TYPICAL CALLING SEQUENCE:
*
*   load AR0
*   load AR1
*   load AR2
*   load AR3
*   load R1
*   CALL MAT
*
*   ARGUMENT ASSIGNMENTS:
*
*   ARGUMENT | FUNCTION
*   ---------+-------------------------------
*   AR0      | ADDRESS OF M(0,0)
*   AR1      | ADDRESS OF V(0)
*   AR2      | ADDRESS OF P(0)
*   AR3      | NUMBER OF ROWS - 1 (K-1)
*   R1       | NUMBER OF COLUMNS - 2 (N-2)
*
*   REGISTERS USED AS INPUT: AR0, AR1, AR2, AR3, R1
*   REGISTERS MODIFIED: R0, R2, AR0, AR1, AR2, AR3, IR0,
*                       RC, RSA, REA
*
*   PROGRAM SIZE: 11
*
*   EXECUTION CYCLES: 6 + 10 * K + K * (N - 1)
*


        .global MAT
*
*   SETUP
*
MAT     LDI     R1,IR0                  ; Number of columns - 2 -> IR0
        ADDI    2,IR0                   ; IR0 = N
*
*   FOR (i = 0; i < K; i++) LOOP OVER THE ROWS.
*
ROWS    LDF     0.0,R2                  ; Initialize R2
        MPYF3   *AR0++(1),*AR1++(1),R0
*                                       ; m(i,0) * v(0) -> R0
*
*   FOR (j = 1; j < N; j++) DO DOT PRODUCT OVER COLUMNS
*
        RPTS    R1                      ; Multiply a row by a column.
*
        MPYF3   *AR0++(1),*AR1++(1),R0  ; m(i,j) * v(j) -> R0
||      ADDF3   R0,R2,R2                ; m(i,j-1) * v(j-1) + R2 -> R2
*
        DBD     AR3,ROWS                ; Counts the number of rows left.
*
        ADDF    R0,R2                   ; Last accumulate.
        STF     R2,*AR2++(1)            ; Result -> p(i)
        NOP     *--AR1(IR0)             ; Set AR1 to point to v(0).
*
*   !!! DELAYED BRANCH HAPPENS HERE !!!
*
*   RETURN SEQUENCE
*
        RETS                            ; Return
*
*   end
*
        .end


12.4.4 Fast Fourier Transforms (FFT) 


Fourier transforms are an important tool often used in digital signal processing
systems. The purpose of the transform is to convert information from the time
domain to the frequency domain. The inverse Fourier transform converts
information back to the time domain from the frequency domain.
Computationally efficient implementations of Fourier transforms are known as
Fast Fourier Transforms (FFTs). The theory of FFTs can be found in books such
as “DFT/FFT and Convolution Algorithms” by C.S. Burrus and T.W. Parks
(John Wiley, 1985), and in the book “Digital Signal Processing Applications
with the TMS320 Family”.


The TMS320C30 has many features that permit very efficient implementation
of numerically intensive algorithms. Some of these features are particularly
well suited for FFTs. The high speed of the device (60-ns cycle time) makes
real-time algorithms easier to implement, while the floating-point
capability eliminates the problems associated with dynamic range. The
powerful indexing scheme in indirect addressing facilitates access to FFT
butterfly legs that have different spans. The repeat block, implemented by the
RPTB instruction, reduces the looping overhead in algorithms that depend
heavily on loops (such as FFTs). This construct gives the efficiency of
in-line coding but has the form of a loop. Since the output of the FFT is in
scrambled (bit-reversed) order when the input is in regular order, the output
must be restored to the proper order. On the TMS320C30, no extra cycles are
needed for this rearrangement: the device has a special form of indirect
addressing (bit-reversed addressing mode) that can be used when the FFT
output is needed. This mode permits accessing the FFT output in the proper
order.
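
The reordering that bit-reversed addressing performs can be illustrated in software. The following Python sketch shows the index mapping only; it is not a model of the TMS320C30 addressing hardware, the function names are hypothetical, and the data length is assumed to be a power of two.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

def unscramble(data):
    """Return data reordered from bit-reversed order to natural order."""
    n = len(data)
    bits = n.bit_length() - 1   # log2(n), assuming n is a power of two
    return [data[bit_reverse(k, bits)] for k in range(n)]
```

For an 8-point transform, the element stored at position 4 (binary 100) is output element 1 (binary 001), and so on.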


Fast Fourier Transform is a label for a collection of algorithms that
efficiently convert data from the time domain to the frequency domain. FFT
algorithms differ in the following ways:

• Radix-2 and radix-4 algorithms (depending on the size of the FFT butterfly)
• Decimation in time or frequency (DIT or DIF)
• Complex or real FFTs
• FFTs of different lengths, etc.
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The present implementation of the FFT was based on programs contained in 
the book “DFT/FFT and Convolution Algorithms” by C.S. Burrus and T.W. 
Parks, and in the paper “Real-Valued Fast Fourier Transform Algorithms” by
H.V. Sorensen et al. (IEEE Trans. on ASSP, June 1987). 


Examples 12-35 and 12-36 show the implementation of a complex, radix-2,
DIF FFT on the TMS320C30. Example 12-35 contains the generic code of
the FFT, which can be used with any transform length. However, a complete
implementation of an FFT also needs a table of twiddle factors (sines and
cosines), and this table depends on the size of the transform. To retain the
generic form of Example 12-35, the table with the twiddle factors (containing
1 1/4 complete cycles of a sine) is presented separately in Example 12-36 for
the case of a 64-point FFT. A full cycle of the sine should span a number of
points equal to the FFT size. Example 12-36 also defines the FFT length N and
M, the logarithm of N to a base equal to the radix. M is the number of stages
of the FFT. For a 64-point FFT, M=6 when using a radix-2 algorithm, or M=3
when using a radix-4 algorithm. If the table with the twiddle factors and the
FFT code are kept in separate files, they should be connected at link time.
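
A table of this form can be generated on a host machine for any transform size. The following Python sketch is one possible way to do so; the function name sine_table is hypothetical, and the output is a plain list of values rather than assembler source.

```python
import math

def sine_table(n):
    """Return 5*n/4 samples of sin(2*pi*k/n): 1 1/4 cycles of a sine."""
    return [math.sin(2.0 * math.pi * k / n) for k in range(5 * n // 4)]

table = sine_table(64)
# The cosine values are the same table read with an offset of N/4
# entries, because cos(x) = sin(x + pi/2).
cosine = table[64 // 4:]
```

Rounded to six decimal places, the first entries for N = 64 are 0.000000, 0.098017, 0.195090, which agree with the values listed in Example 12-36.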
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Example 12-35. Complex, Radix-2, DIF FFT 


*
*   TITL COMPLEX, RADIX-2, DIF FFT
*
*   GENERIC PROGRAM FOR A LOOPED-CODE RADIX-2 FFT COMPUTATION IN 320C30
*
*   THE PROGRAM IS TAKEN FROM THE BURRUS AND PARKS BOOK, P. 111.
*   THE (COMPLEX) DATA RESIDE IN INTERNAL MEMORY. THE COMPUTATION
*   IS DONE IN-PLACE, BUT THE RESULT IS MOVED TO ANOTHER MEMORY
*   SECTION TO DEMONSTRATE THE BIT-REVERSED ADDRESSING.
*
*   THE TWIDDLE FACTORS ARE SUPPLIED IN A TABLE PUT IN A .DATA SECTION.
*   THIS DATA IS INCLUDED IN A SEPARATE FILE TO PRESERVE THE GENERIC
*   NATURE OF THE PROGRAM. FOR THE SAME PURPOSE, THE SIZE OF THE FFT
*   N AND LOG2(N) ARE DEFINED IN A .GLOBL DIRECTIVE AND SPECIFIED
*   DURING LINKING.
*
        .globl  FFT                     ; Entry point for execution
        .globl  N                       ; FFT size
        .globl  M                       ; LOG2(N)
        .globl  SINE                    ; Address of sine table
*
INP     .usect  "IN",1024               ; Memory with input data
        .BSS    OUTP,1024               ; Memory with output data
        .text
*
*   INITIALIZE
*
FFTSIZ  .word   N
LOGFFT  .word   M
SINTAB  .word   SINE
INPUT   .word   INP
OUTPUT  .word   OUTP
*
FFT:    LDP     FFTSIZ                  ; Command to load data page pointer
        LDI     @FFTSIZ,IR1
        LSH     -2,IR1                  ; IR1=N/4, pointer for SIN/COS table
        LDI     0,AR6                   ; AR6 holds the current stage number
        LDI     @FFTSIZ,IR0
        LSH     1,IR0                   ; IR0=2*N1 (because of real/imag)
        LDI     @FFTSIZ,R7              ; R7=N2
        LDI     1,AR7                   ; Initialize repeat counter of first loop
        LDI     1,AR5                   ; Initialize IE index (AR5=IE)
*
*   OUTER LOOP
*
LOOP:   NOP     *++AR6(1)               ; Current FFT stage
        LDI     @INPUT,AR0              ; AR0 points to X(I)
        ADDI    R7,AR0,AR2              ; AR2 points to X(L)
        LDI     AR7,RC
        SUBI    1,RC                    ; RC should be one less than desired #
*
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* FIRST LOOP 


        RPTB    BLK1
        ADDF    *AR0,*AR2,R0            ; R0=X(I)+X(L)
        SUBF    *AR2++,*AR0++,R1        ; R1=X(I)-X(L)
        ADDF    *AR2,*AR0,R2            ; R2=Y(I)+Y(L)
        SUBF    *AR2,*AR0,R3            ; R3=Y(I)-Y(L)
        STF     R2,*AR0--               ; Y(I)=R2 and...
||      STF     R3,*AR2--               ; Y(L)=R3
BLK1    STF     R0,*AR0++(IR0)          ; X(I)=R0 and...
||      STF     R1,*AR2++(IR0)          ; X(L)=R1 and AR0,2 = AR0,2 + 2*n


*   IF THIS IS THE LAST STAGE, YOU ARE DONE
*
        CMPI    @LOGFFT,AR6
        BZD     END
*
*   MAIN INNER LOOP
*
        LDI     2,AR1                   ; Init loop counter for inner loop
        LDI     @SINTAB,AR4             ; Initialize IA index (AR4=IA)
INLOP:  ADDI    AR5,AR4                 ; IA=IA+IE; AR4 points to cosine
        LDI     AR1,AR0
        ADDI    2,AR1                   ; Increment inner loop counter
        ADDI    @INPUT,AR0              ; (X(I),Y(I)) pointer
        ADDI    R7,AR0,AR2              ; (X(L),Y(L)) pointer
        LDI     AR7,RC
        SUBI    1,RC                    ; RC should be one less than desired #
        LDF     *AR4,R6                 ; R6=SIN
*
*   SECOND LOOP
*
        RPTB    BLK2
        SUBF    *AR2,*AR0,R2            ; R2=X(I)-X(L)
        SUBF    *+AR2,*+AR0,R1          ; R1=Y(I)-Y(L)
        MPYF    R2,R6,R0                ; R0=R2*SIN and...
||      ADDF    *+AR2,*+AR0,R3          ; R3=Y(I)+Y(L)
        MPYF    R1,*+AR4(IR1),R3        ; R3=R1*COS and...
||      STF     R3,*+AR0                ; Y(I)=Y(I)+Y(L)
        SUBF    R0,R3,R4                ; R4=R1*COS-R2*SIN
        MPYF    R1,R6,R0                ; R0=R1*SIN and...
||      ADDF    *AR2,*AR0,R3            ; R3=X(I)+X(L)
        MPYF    R2,*+AR4(IR1),R3        ; R3=R2*COS and...
||      STF     R3,*AR0++(IR0)          ; X(I)=X(I)+X(L) and AR0=AR0+2*N1
        ADDF    R0,R3,R5                ; R5=R2*COS+R1*SIN
BLK2    STF     R5,*AR2++(IR0)          ; X(L)=R2*COS+R1*SIN, incr AR2 and...
||      STF     R4,*+AR2                ; Y(L)=R1*COS-R2*SIN
        CMPI    R7,AR1
        BNE     INLOP                   ; Loop back to the inner loop
        LSH     1,AR7                   ; Increment loop counter for next time
        BRD     LOOP                    ; Next FFT stage (delayed)
        LSH     1,AR5                   ; IE=2*IE
        LDI     R7,IR0                  ; N1=N2
        LSH     -1,R7                   ; N2=N2/2
*
*   STORE RESULT OUT USING BIT-REVERSED ADDRESSING
*
END:    LDI     @FFTSIZ,RC              ; RC=N
        SUBI    1,RC                    ; RC should be one less than desired #
        LDI     @FFTSIZ,IR0             ; IR0=size of FFT=N
        LDI     2,IR1
        LDI     @INPUT,AR0
        LDI     @OUTPUT,AR1
        RPTB    BITRV
        LDF     *+AR0(1),R0
||      LDF     *AR0++(IR0)B,R1
BITRV   STF     R0,*+AR1(1)
||      STF     R1,*AR1++(IR1)
SELF    BR      SELF                    ; Branch to itself at the end
        .end
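
The structure of Example 12-35 (in-place radix-2 DIF butterflies followed by a bit-reversed copy of the result) can be checked against a host-side reference. The following Python sketch follows the same overall scheme, but it is a simplified model: it computes each twiddle factor on the fly instead of reading a table, it uses complex numbers rather than interleaved real and imaginary words, and the function names are hypothetical. The input length is assumed to be a power of two.

```python
import cmath

def fft_dif_radix2(x):
    """In-place radix-2 DIF FFT; the result is left in bit-reversed order."""
    n = len(x)                      # assumed to be a power of two
    span = n // 2                   # distance between butterfly legs
    while span >= 1:
        for start in range(0, n, 2 * span):
            for j in range(span):
                w = cmath.exp(-2j * cmath.pi * j / (2 * span))
                a, b = x[start + j], x[start + j + span]
                x[start + j] = a + b
                x[start + j + span] = (a - b) * w
        span //= 2
    return x

def bit_reversed_copy(x):
    """Copy x into a new list so that the output is in natural order."""
    n = len(x)
    bits = n.bit_length() - 1
    out = [0j] * n
    for k in range(n):
        r, v = 0, k
        for _ in range(bits):       # reverse the bits of k
            r = (r << 1) | (v & 1)
            v >>= 1
        out[r] = x[k]
    return out
```

Applying bit_reversed_copy to the output of fft_dif_radix2 gives the transform in natural order, which can then be compared with a directly evaluated DFT.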
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Example 12-36. Table With Twiddle Factors For A 64-Point FFT 


*
*   TITL TABLE WITH TWIDDLE FACTORS FOR A 64-POINT FFT
*
*   FILE TO BE LINKED WITH THE SOURCE CODE FOR A 64-POINT, RADIX-2 FFT.
*
        .globl  SINE
        .globl  N
        .globl  M
*
N       .set    64
M       .set    6
        .data
*
SINE
        .float  0.000000
        .float  0.098017
        .float  0.195090
        .float  0.290285
        .float  0.382683
        .float  0.471397
        .float  0.555570
        .float  0.634393
        .float  0.707107
        .float  0.773010
        .float  0.831470
        .float  0.881921
        .float  0.923880
        .float  0.956940
        .float  0.980785
        .float  0.995185
COSINE
        .float  1.000000
        .float  0.995185
        .float  0.980785
        .float  0.956940
        .float  0.923880
        .float  0.881921
        .float  0.831470
        .float  0.773010
        .float  0.707107
        .float  0.634393
        .float  0.555570
        .float  0.471397
        .float  0.382683
        .float  0.290285
        .float  0.195090
        .float  0.098017
        .float  0.000000
        .float  -0.098017
        .float  -0.195090
        .float  -0.290285
        .float  -0.382683
        .float  -0.471397
        .float  -0.555570
        .float  -0.634393
        .float  -0.707107
        .float  -0.773010
        .float  -0.831470
        .float  -0.881921
        .float  -0.923880
        .float  -0.956940
        .float  -0.980785
        .float  -0.995185
        .float  -1.000000
        .float  -0.995185
        .float  -0.980785
        .float  -0.956940
        .float  -0.923880
        .float  -0.881921
        .float  -0.831470
        .float  -0.773010
        .float  -0.707107
        .float  -0.634393
        .float  -0.555570
        .float  -0.471397
        .float  -0.382683
        .float  -0.290285
        .float  -0.195090
        .float  -0.098017
        .float  0.000000
        .float  0.098017
        .float  0.195090
        .float  0.290285
        .float  0.382683
        .float  0.471397
        .float  0.555570
        .float  0.634393
        .float  0.707107
        .float  0.773010
        .float  0.831470
        .float  0.881921
        .float  0.923880
        .float  0.956940
        .float  0.980785
        .float  0.995185


The radix-2 algorithm has great tutorial value because it is relatively easy to
understand how the FFT algorithm functions. However, radix-4 implementations
can increase the speed of execution by reducing the overall arithmetic
required. Example 12-37 shows the generic implementation of a complex, DIF
FFT in radix-4. A companion table, like the one in Example 12-36, should have
a value of M equal to log N, where the base of the logarithm is four.
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Example 12-37. Complex, Radix-4, DIF FFT 


*
*   TITL COMPLEX, RADIX-4, DIF FFT
*
*   GENERIC PROGRAM TO DO A LOOPED-CODE RADIX-4 FFT COMPUTATION IN
*   THE TMS320C30.
*
*   THE PROGRAM IS TAKEN FROM THE BURRUS AND PARKS BOOK, P. 117.
*   THE (COMPLEX) DATA RESIDE IN INTERNAL MEMORY, AND THE COMPUTATION
*   IS DONE IN-PLACE.
*
*   THE TWIDDLE FACTORS ARE SUPPLIED IN A TABLE PUT IN A .DATA SECTION.
*   THIS DATA IS INCLUDED IN A SEPARATE FILE TO PRESERVE THE GENERIC
*   NATURE OF THE PROGRAM. FOR THE SAME PURPOSE, THE SIZE OF THE
*   FFT N AND LOG4(N) ARE DEFINED IN A .GLOBL DIRECTIVE AND SPECIFIED
*   DURING LINKING.
*
*   IN ORDER TO HAVE THE FINAL RESULT IN BIT-REVERSED ORDER, THE TWO
*   MIDDLE BRANCHES OF THE RADIX-4 BUTTERFLY ARE INTERCHANGED DURING
*   STORAGE. NOTE THIS DIFFERENCE WHEN COMPARING WITH THE PROGRAM IN
*   P. 117 OF THE BURRUS AND PARKS BOOK.
*
        .globl  FFT                     ; Entry point for execution
        .globl  N                       ; FFT size
        .globl  M                       ; LOG4(N)
        .globl  SINE                    ; Address of sine table
*
        .usect  "IN",INP,1024           ; Memory with input data
        .text
*
*   INITIALIZE
*
TEMP    .word   $+2
STORE   .word   FFTSIZ                  ; Beginning of temp storage area
        .word   N
        .word   M
        .word   SINE
        .word   INP
        .BSS    FFTSIZ,1                ; FFT size
        .BSS    LOGFFT,1                ; LOG4(FFTSIZ)
        .BSS    SINTAB,1                ; Sine/cosine table base
        .BSS    INPUT,1                 ; Area with input data to process
        .BSS    STAGE,1                 ; FFT stage #
        .BSS    RPTCNT,1                ; Repeat counter
        .BSS    IEINDX,1                ; IE index for sine/cosine
        .BSS    LPCNT,1                 ; Second-loop count
        .BSS    JT,1                    ; JT counter in program, P. 117
        .BSS    IA1,1                   ; IA1 index in program, P. 117
*
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FFT: 
*   INITIALIZE DATA LOCATIONS
*
        LDP     TEMP                    ; Command to load data page pointer
        LDI     @TEMP,AR0
        LDI     @STORE,AR1
        LDI     *AR0++,R0               ; Xfer data from one memory to the other
        STI     R0,*AR1++
        LDI     *AR0++,R0
        STI     R0,*AR1++
        LDI     *AR0++,R0
        STI     R0,*AR1++
        LDI     *AR0,R0
        STI     R0,*AR1
*
        LDP     FFTSIZ                  ; Command to load data page pointer
        LDI     @FFTSIZ,R0
        LDI     @FFTSIZ,IR0
        LDI     @FFTSIZ,IR1
        LDI     0,AR7
        STI     AR7,@STAGE              ; @STAGE holds the current stage number
        LSH     1,IR0                   ; IR0=2*N1 (because of real/imag)
        LSH     -2,IR1                  ; IR1=N/4, pointer for SIN/COS table
        LDI     1,AR7
        STI     AR7,@RPTCNT             ; Initialize repeat counter of first loop
        LSH     -2,R0
        STI     AR7,@IEINDX             ; Initialize IE index
        ADDI    2,R0
        STI     R0,@JT                  ; JT=R0/2+2
        SUBI    2,R0
        LSH     1,R0                    ; R0=N2


* OUTER LOOP 


LOOP:
        LDI     @INPUT,AR0              ; AR0 points to X(I)
        ADDI    R0,AR0,AR1              ; AR1 points to X(I1)
        ADDI    R0,AR1,AR2              ; AR2 points to X(I2)
        ADDI    R0,AR2,AR3              ; AR3 points to X(I3)
        LDI     @RPTCNT,RC
        SUBI    1,RC                    ; RC should be one less than desired #


* FIRST LOOP 


        RPTB    BLK1
        ADDF    *+AR0,*+AR2,R1          ; R1=Y(I)+Y(I2)
        ADDF    *+AR3,*+AR1,R3          ; R3=Y(I1)+Y(I3)
        ADDF    R3,R1,R6                ; R6=R1+R3
        SUBF    *+AR2,*+AR0,R4          ; R4=Y(I)-Y(I2)
        STF     R6,*+AR0                ; Y(I)=R1+R3
        SUBF    R3,R1                   ; R1=R1-R3
        LDF     *AR2,R5                 ; R5=X(I2)
||      LDF     *+AR1,R7                ; R7=Y(I1)
        ADDF    *AR3,*AR1,R3            ; R3=X(I1)+X(I3)
        ADDF    R5,*AR0,R1              ; R1=X(I)+X(I2)
||      STF     R1,*+AR1                ; Y(I1)=R1-R3
        ADDF    R3,R1,R6                ; R6=R1+R3
        SUBF    R5,*AR0,R2              ; R2=X(I)-X(I2)
||      STF     R6,*AR0++(IR0)          ; X(I)=R1+R3
        SUBF    R3,R1                   ; R1=R1-R3
        SUBF    *AR3,*AR1,R6            ; R6=X(I1)-X(I3)
        SUBF    R7,*+AR3,R3             ; -R3=Y(I1)-Y(I3)
||      STF     R1,*AR1++(IR0)          ; X(I1)=R1-R3
        SUBF    R6,R4,R5                ; R5=R4-R6
        ADDF    R6,R4                   ; R4=R4+R6
        STF     R5,*+AR2                ; Y(I2)=R4-R6
||      STF     R4,*+AR3                ; Y(I3)=R4+R6
        SUBF    R3,R2,R5                ; R5=R2-R3
        ADDF    R3,R2                   ; R2=R2+R3
BLK1    STF     R5,*AR2++(IR0)          ; X(I2)=R2-R3
||      STF     R2,*AR3++(IR0)          ; X(I3)=R2+R3


* IF THIS IS THE LAST STAGE, YOU ARE DONE 


        LDI     @STAGE,AR7
        ADDI    1,AR7
        CMPI    @LOGFFT,AR7
        BZD     END
        STI     AR7,@STAGE              ; Current FFT stage
*
*   MAIN INNER LOOP
*
        LDI     1,AR7
        STI     AR7,@IA1                ; Init IA1 index
        LDI     2,AR7
        STI     AR7,@LPCNT              ; Init loop counter for inner loop
INLOP:
        LDI     2,AR6                   ; Increment inner loop counter
        ADDI    @LPCNT,AR6
        LDI     @LPCNT,AR0
        LDI     @IA1,AR7
        ADDI    @IEINDX,AR7             ; IA1=IA1+IE
        ADDI    @INPUT,AR0              ; (X(I),Y(I)) pointer
        STI     AR7,@IA1
        ADDI    R0,AR0,AR1              ; (X(I1),Y(I1)) pointer
        STI     AR6,@LPCNT
        ADDI    R0,AR1,AR2              ; (X(I2),Y(I2)) pointer
        ADDI    R0,AR2,AR3              ; (X(I3),Y(I3)) pointer
        LDI     @RPTCNT,RC
        SUBI    1,RC                    ; RC should be one less than desired #
        CMPI    @JT,AR6                 ; If LPCNT=JT, go to
        BZD     SPCL                    ; special butterfly
        LDI     @IA1,AR7
        LDI     @IA1,AR4
        ADDI    @SINTAB,AR4             ; Create cosine index AR4
        ADDI    AR4,AR7,AR5
        SUBI    1,AR5                   ; IA2=IA1+IA1-1
        ADDI    AR7,AR5,AR6
        SUBI    1,AR6                   ; IA3=IA2+IA1-1
*
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; SECOND LOOP 


        RPTB    BLK2
        ADDF    *+AR2,*+AR0,R3          ; R3=Y(I)+Y(I2)
        ADDF    *+AR3,*+AR1,R5          ; R5=Y(I1)+Y(I3)
        ADDF    R5,R3,R6                ; R6=R3+R5
        SUBF    *+AR2,*+AR0,R4          ; R4=Y(I)-Y(I2)
        SUBF    R5,R3                   ; R3=R3-R5
        ADDF    *AR2,*AR0,R1            ; R1=X(I)+X(I2)
        ADDF    *AR3,*AR1,R5            ; R5=X(I1)+X(I3)
        MPYF    R3,*+AR5(IR1),R6        ; R6=R3*CO2
||      STF     R6,*+AR0                ; Y(I)=R3+R5
        ADDF    R5,R1,R7                ; R7=R1+R5
        SUBF    *AR2,*AR0,R2            ; R2=X(I)-X(I2)
        SUBF    R5,R1                   ; R1=R1-R5
        MPYF    R1,*AR5,R7              ; R7=R1*SI2
||      STF     R7,*AR0++(IR0)          ; X(I)=R1+R5
        SUBF    R7,R6                   ; R6=R3*CO2-R1*SI2
        SUBF    *+AR3,*+AR1,R5          ; R5=Y(I1)-Y(I3)
        MPYF    R1,*+AR5(IR1),R7        ; R7=R1*CO2
||      STF     R6,*+AR1                ; Y(I1)=R3*CO2-R1*SI2
        MPYF    R3,*AR5,R6              ; R6=R3*SI2
        ADDF    R7,R6                   ; R6=R1*CO2+R3*SI2
        ADDF    R5,R2,R1                ; R1=R2+R5
        SUBF    R5,R2                   ; R2=R2-R5
        SUBF    *AR3,*AR1,R5            ; R5=X(I1)-X(I3)
        SUBF    R5,R4,R3                ; R3=R4-R5
        ADDF    R5,R4                   ; R4=R4+R5
        MPYF    R3,*+AR4(IR1),R6        ; R6=R3*CO1
||      STF     R6,*AR1++(IR0)          ; X(I1)=R1*CO2+R3*SI2
        MPYF    R1,*AR4,R7              ; R7=R1*SI1
        SUBF    R7,R6                   ; R6=R3*CO1-R1*SI1
        MPYF    R1,*+AR4(IR1),R6        ; R6=R1*CO1
||      STF     R6,*+AR2                ; Y(I2)=R3*CO1-R1*SI1
        MPYF    R3,*AR4,R7              ; R7=R3*SI1
        ADDF    R7,R6                   ; R6=R1*CO1+R3*SI1
        MPYF    R4,*+AR6(IR1),R6        ; R6=R4*CO3
||      STF     R6,*AR2++(IR0)          ; X(I2)=R1*CO1+R3*SI1
        MPYF    R2,*AR6,R7              ; R7=R2*SI3
        SUBF    R7,R6                   ; R6=R4*CO3-R2*SI3
        MPYF    R2,*+AR6(IR1),R6        ; R6=R2*CO3
||      STF     R6,*+AR3                ; Y(I3)=R4*CO3-R2*SI3
        MPYF    R4,*AR6,R7              ; R7=R4*SI3
        ADDF    R7,R6                   ; R6=R2*CO3+R4*SI3
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BLK2 
        STF     R6,*AR3++(IR0)          ; X(I3)=R2*CO3+R4*SI3
*
        CMPI    @LPCNT,R0
        BP      INLOP                   ; LOOP BACK TO THE INNER LOOP
        BR      CONT
*
*   SPECIAL BUTTERFLY FOR W=J
*
SPCL    LDI     IR1,AR4
        LSH     -1,AR4                  ; Point to SIN(45)
        ADDI    @SINTAB,AR4             ; Create cosine index AR4=CO21
*
        RPTB    BLK3
        ADDF    *AR2,*AR0,R1            ; R1=X(I)+X(I2)
        SUBF    *AR2,*AR0,R2            ; R2=X(I)-X(I2)
        ADDF    *+AR2,*+AR0,R3          ; R3=Y(I)+Y(I2)
        SUBF    *+AR2,*+AR0,R4          ; R4=Y(I)-Y(I2)
        ADDF    *AR3,*AR1,R5            ; R5=X(I1)+X(I3)
        SUBF    R1,R5,R6                ; R6=R5-R1
        ADDF    R5,R1                   ; R1=R1+R5
        ADDF    *+AR3,*+AR1,R5          ; R5=Y(I1)+Y(I3)
        SUBF    R5,R3,R7                ; R7=R3-R5
        ADDF    R5,R3                   ; R3=R3+R5
        STF     R3,*+AR0                ; Y(I)=R3+R5
||      STF     R1,*AR0++(IR0)          ; X(I)=R1+R5
        SUBF    *AR3,*AR1,R1            ; R1=X(I1)-X(I3)
        SUBF    *+AR3,*+AR1,R3          ; R3=Y(I1)-Y(I3)
        STF     R6,*+AR1                ; Y(I1)=R5-R1
||      STF     R7,*AR1++(IR0)          ; X(I1)=R3-R5
        ADDF    R3,R2,R5                ; R5=R2+R3
        SUBF    R2,R3,R2                ; R2=-R2+R3
        SUBF    R1,R4,R3                ; R3=R4-R1
        ADDF    R1,R4                   ; R4=R4+R1
        SUBF    R5,R3,R1                ; R1=R3-R5
        MPYF    *AR4,R1                 ; R1=R1*CO21
        ADDF    R5,R3                   ; R3=R3+R5
        MPYF    *AR4,R3                 ; R3=R3*CO21
||      STF     R1,*+AR2                ; Y(I2)=(R3-R5)*CO21
        SUBF    R4,R2,R1                ; R1=R2-R4
        MPYF    *AR4,R1                 ; R1=R1*CO21
||      STF     R3,*AR2++(IR0)          ; X(I2)=(R3+R5)*CO21
        ADDF    R4,R2                   ; R2=R2+R4
        MPYF    *AR4,R2                 ; R2=R2*CO21
BLK3    STF     R1,*+AR3                ; Y(I3)=-(R4-R2)*CO21
||      STF     R2,*AR3++(IR0)          ; X(I3)=(R4+R2)*CO21
*
        CMPI    @LPCNT,R0
        BPD     INLOP                   ; Loop back to the inner loop
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CONT 


        LDI     @RPTCNT,AR7
        LDI     @IEINDX,AR6
        LSH     2,AR7                   ; Increment repeat counter for
*                                       ; next time
        STI     AR7,@RPTCNT
        LSH     2,AR6                   ; IE=4*IE
        STI     AR6,@IEINDX
        LDI     R0,IR0                  ; N1=N2
        LSH     -3,R0
        ADDI    2,R0
        STI     R0,@JT                  ; JT=N2/2+2
        SUBI    2,R0
        LSH     1,R0                    ; N2=N2/4
        BR      LOOP                    ; Next FFT stage
*
*   STORE RESULT OUT USING BIT-REVERSED ADDRESSING
*
END:    LDI     @FFTSIZ,RC              ; RC=N
        SUBI    1,RC                    ; RC should be one less than desired #
        LDI     @FFTSIZ,IR0             ; IR0=size of FFT=N
        LDI     2,IR1
        LDI     @INPUT,AR0
        LDP     STORE
        LDI     @STORE,AR1
        RPTB    BITRV
        LDF     *+AR0(1),R0
||      LDF     *AR0++(IR0)B,R1
BITRV   STF     R0,*+AR1(1)
||      STF     R1,*AR1++(IR1)
SELF    BR      SELF                    ; Branch to itself at the end.
        .end


Most often, the data to be transformed is a sequence of real numbers. In this
case, the FFT exhibits certain symmetries that permit a further reduction of
the computational load. Example 12-38 shows the generic implementation of a
real-valued, radix-2 FFT. For such an FFT, the total storage required for a
length-N transform is only N locations, instead of the 2N locations that are
necessary for a complex FFT. The rest of the points can be recovered from the
symmetry conditions.
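
The symmetry in question is that, for real input, X(N-k) is the complex conjugate of X(k). The following Python sketch illustrates how the upper half of the spectrum can be recovered from the lower half. It uses a directly evaluated DFT for clarity, not the algorithm of Example 12-38, and the function names are hypothetical.

```python
import cmath

def dft(x):
    """Directly evaluated DFT, used here only as a reference."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def recover_full_spectrum(half):
    """Rebuild X(0..N-1) from X(0..N/2) of a real input sequence.

    Uses the conjugate symmetry X(N-k) = conj(X(k)).
    """
    n = 2 * (len(half) - 1)
    upper = [half[n - k].conjugate() for k in range(n // 2 + 1, n)]
    return list(half) + upper
```

Since X(0) and X(N/2) are purely real for real input, the N/2 + 1 retained points carry exactly N independent real values, which is why N storage locations are sufficient.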
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Example 12-38. Real, Radix-2 FFT 


*
*   TITL REAL, RADIX-2 FFT
*
*   GENERIC PROGRAM TO DO A RADIX-2 REAL FFT COMPUTATION IN 320C30.
*
*   THE PROGRAM IS TAKEN FROM THE PAPER BY SORENSEN ET AL., JUNE 1987
*   ISSUE OF THE TRANSACTIONS ON ASSP.
*
*   THE REAL DATA RESIDE IN INTERNAL MEMORY. THE COMPUTATION IS
*   DONE IN-PLACE. THE BIT-REVERSAL IS DONE AT THE BEGINNING OF
*   THE PROGRAM.
*
*   THE TWIDDLE FACTORS ARE SUPPLIED IN A TABLE PUT IN A .DATA
*   SECTION. THIS DATA IS INCLUDED IN A SEPARATE FILE TO PRESERVE
*   THE GENERIC NATURE OF THE PROGRAM. FOR THE SAME PURPOSE, THE
*   SIZE OF THE FFT N AND LOG2(N) ARE DEFINED IN A .GLOBL DIRECTIVE
*   AND SPECIFIED DURING LINKING. THE LENGTH OF THE TABLE IS
*   N/4 + N/4 = N/2.
*
        .globl  FFT                     ; Entry point for execution
        .globl  N                       ; FFT size
        .globl  M                       ; LOG2(N)
        .globl  SINE                    ; Address of sine table
*
        .bss    INP,1024                ; Memory with input data
        .text
*
*   INITIALIZE
*
FFTSIZ  .word   N
LOGFFT  .word   M
SINTAB  .word   SINE
INPUT   .word   INP
*
FFT:    LDP     FFTSIZ                  ; Command to load data page pointer
*
*   DO THE BIT-REVERSING AT THE BEGINNING
*
        LDI     @FFTSIZ,RC              ; RC=N
        SUBI    1,RC                    ; RC should be one less than desired #
        LDI     @FFTSIZ,IR0
        LSH     -1,IR0                  ; IR0=half the size of FFT=N/2
        LDI     @INPUT,AR0
        LDI     @INPUT,AR1
        RPTB    BITRV
        CMPI    AR1,AR0                 ; Exchange locations only
        BGE     CONT                    ; if AR0<AR1
        LDF     *AR0,R0
||      LDF     *AR1,R1
        STF     R0,*AR1
||      STF     R1,*AR0
CONT    NOP     *AR0++
BITRV   NOP     *AR1++(IR0)B
*
*   LENGTH-TWO BUTTERFLIES
*
        LDI     @INPUT,AR0              ; AR0 points to X(I)
        LDI     IR0,RC                  ; Repeat N/2 times
        SUBI    1,RC                    ; RC should be one less than desired #
*
        RPTB    BLK1
        ADDF    *+AR0,*AR0++,R0         ; R0=X(I)+X(I+1)
        SUBF    *AR0,*-AR0,R1           ; R1=X(I)-X(I+1)
BLK1    STF     R0,*-AR0                ; X(I)=X(I)+X(I+1)
||      STF     R1,*AR0++               ; X(I+1)=X(I)-X(I+1)
*
*   FIRST PASS OF THE DO-20 LOOP (STAGE K=2 IN DO-10 LOOP)
*
        LDI     @INPUT,AR0              ; AR0 points to X(I)
        LDI     2,IR0                   ; IR0=2=N2
        LDI     @FFTSIZ,RC
        LSH     -2,RC                   ; Repeat N/4 times
        SUBI    1,RC                    ; RC should be one less than desired #
*
        RPTB    BLK2
        ADDF    *+AR0(IR0),*AR0++(IR0),R0 ; R0=X(I)+X(I+2)
        SUBF    *AR0,*-AR0(IR0),R1      ; R1=X(I)-X(I+2)
        NEGF    *+AR0,R0                ; R0=-X(I+3)
||      STF     R0,*-AR0(IR0)           ; X(I)=X(I)+X(I+2)
BLK2    STF     R1,*AR0++(IR0)          ; X(I+2)=X(I)-X(I+2)
||      STF     R0,*+AR0                ; X(I+3)=-X(I+3)
*
*   MAIN LOOP (FFT STAGES)
*
        LDI     @FFTSIZ,IR0
        LSH     -2,IR0                  ; IR0=index for E
        LDI     3,R5                    ; R5 holds the current stage number
        LDI     1,R4                    ; R4=N4
        LDI     2,R3                    ; R3=N2
LOOP    LSH     -1,IR0                  ; E=E/2
        LSH     1,R4                    ; N4=2*N4
        LSH     1,R3                    ; N2=2*N2
*
*   INNER LOOP (DO-20 LOOP IN THE PROGRAM)
*
        LDI     @INPUT,AR5              ; AR5 points to X(I)
INLOP   LDI     IR0,AR0
        ADDI    @SINTAB,AR0             ; AR0 points to SIN/COS table
        LDI     R4,IR1                  ; IR1=N4
        LDI     AR5,AR1
        ADDI    1,AR1                   ; AR1 points to X(I1)=X(I+J)
        LDI     AR1,AR3
        ADDI    R3,AR3                  ; AR3 points to X(I3)=X(I+J+N2)
        LDI     AR3,AR2
        SUBI    2,AR2                   ; AR2 points to X(I2)=X(I-J+N2)
        ADDI    R3,AR2,AR4              ; AR4 points to X(I4)=X(I-J+N1)
        LDF     *AR5++(IR1),R0          ; R0=X(I)
        ADDF    *+AR5(IR1),R0,R1        ; R1=X(I)+X(I+N2)
        SUBF    R0,*++AR5(IR1),R0       ; R0=-X(I)+X(I+N2)
||      STF     R1,*-AR5(IR1)           ; X(I)=X(I)+X(I+N2)
        NEGF    R0                      ; R0=X(I)-X(I+N2)
        NEGF    *++AR5(IR1),R1          ; R1=-X(I+N4+N2)
||      STF     R0,*AR5                 ; X(I+N2)=X(I)-X(I+N2)
        STF     R1,*AR5                 ; X(I+N4+N2)=-X(I+N4+N2)
*
*   INNERMOST LOOP
*
        LDI     @FFTSIZ,IR1
        LSH     -2,IR1                  ; IR1=separation between SIN/COS tbls
        LDI     R4,RC
        SUBI    2,RC                    ; Repeat N4-1 times
        RPTB    BLK3
        MPYF    *AR3,*+AR0(IR1),R0      ; R0=X(I3)*COS
        MPYF    *AR4,*AR0,R1            ; R1=X(I4)*SIN
        MPYF    *AR4,*+AR0(IR1),R1      ; R1=X(I4)*COS
||      ADDF    R0,R1,R2                ; R2=X(I3)*COS+X(I4)*SIN
        MPYF    *AR3,*AR0++(IR0),R0     ; R0=X(I3)*SIN
        SUBF    R0,R1,R0                ; R0=-X(I3)*SIN+X(I4)*COS
        SUBF    *AR2,R0,R1              ; R1=-X(I2)+R0
        ADDF    *AR2,R0,R1              ; R1=X(I2)+R0
||      STF     R1,*AR3++               ; X(I3)=-X(I2)+R0
        ADDF    *AR1,R2,R1              ; R1=X(I1)+R2
||      STF     R1,*AR4--               ; X(I4)=X(I2)+R0
        SUBF    R2,*AR1,R1              ; R1=X(I1)-R2
||      STF     R1,*AR1++               ; X(I1)=X(I1)+R2
BLK3    STF     R1,*AR2--               ; X(I2)=X(I1)-R2
        SUBI    @INPUT,AR5
        ADDI    R3,AR5                  ; AR5=I+N1
        CMPI    @FFTSIZ,AR5
        BLED    INLOP                   ; Loop back to the inner loop
        ADDI    @INPUT,AR5
        NOP
        NOP
        ADDI    1,R5
        CMPI    @LOGFFT,R5
        BLE     LOOP
END     BR      END                     ; Branch to itself at the end.
        .end
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Table 12-1 summarizes the execution time required for FFT lengths between 
64 and 1024 points for the three algorithms in Examples 12-35, 12-37, and 
12-38. As can be seen, the TMS320C30 permits very fast execution of such 
transforms. FFT lengths up to 1024 points (complex) or 2048 points (real), 
covering the majority of applications, can be executed almost entirely in the 
on-chip memory. 


Table 12-1. TMS320C30 FFT Timing Benchmarks 


NUMBER FFT TIMING 
OF (in milliseconds) 
POINTS RADIX-2 RADIX-4 RADIX-2 
(complex) (complex) (real) 


12.4.5 Lattice Filters 


The lattice form is an alternative way of implementing digital filters, and it has 
found applications in speech processing, spectral estimation, and other areas. 
In the present discussion, the notation and the terminology from speech pro- 
cessing applications will be used. 


If H(z) is the transfer function of a digital filter that has only poles, A(z) = 
1/H(z) will be a filter having only zeros and it will be called the inverse filter. 
The inverse lattice filter is shown in Figure 12-5. In mathematical terms, it is 
described by the equations: 


    f(i,n) = f(i-1,n) + k(i) b(i-1,n-1)
    b(i,n) = b(i-1,n-1) + k(i) f(i-1,n)

Initial conditions:

    f(0,n) = b(0,n) = x(n)

Final conditions:

    y(n) = f(p,n)

f(i,n) is called the forward error, b(i,n) is the backward error, k(i) is the
i-th reflection coefficient, x(n) is the input, and y(n) is the output signal.
The order of the filter (i.e., the number of stages) is p. In the linear
predictive coding (LPC) method of speech processing, the inverse lattice
filter is used during analysis, and the (forward) lattice filter during speech
synthesis.
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x(n) (1, n) (2, n) f(p—1, n) fip, n)=y(n) 


Kp 


Kp 


b(p— 1, n) 
Figure 12-5. Structure of the Inverse Lattice Filter 
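

The recursions given above for the inverse lattice filter translate almost line
for line into a floating-point reference model, which can be useful for checking
the output of an optimized routine. The following is a minimal sketch in Python
and is not part of the original example; the name lattice_inverse, the list-based
state, and the assumption that all backward terms are zero before the first
sample are illustrative choices.

```python
def lattice_inverse(x, k):
    """Inverse (analysis) lattice filter; returns y(n) = f(p, n).

    x -- input samples x(n)
    k -- reflection coefficients, k[i] holding k(i+1)
    Assumption: b(i, -1) = 0 for every stage before the first sample.
    """
    p = len(k)
    b_prev = [0.0] * p            # b_prev[i] holds b(i, n-1), i = 0..p-1
    y = []
    for xn in x:
        f = xn                    # f(0, n) = x(n)
        b_cur = xn                # b(0, n) = x(n)
        for i in range(p):
            f_next = f + k[i] * b_prev[i]   # f(i+1,n) = f(i,n) + k(i+1) b(i,n-1)
            b_next = b_prev[i] + k[i] * f   # b(i+1,n) = b(i,n-1) + k(i+1) f(i,n)
            b_prev[i] = b_cur               # keep b(i, n) for the next sample
            f, b_cur = f_next, b_next
        y.append(f)
    return y
```

With k = [0.5, 0.25], an impulse input produces 1, 0.625, 0.25, 0, which
corresponds to A(z) = 1 + 0.625 z^-1 + 0.25 z^-2.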


Figure 12-6 shows the data memory organization of the inverse lattice-filter 
on the TMS320C30. 


[Figure: memory layout not reproduced. Two columns, reflection coefficients
and backward propagation terms, each running from low address to high address.]

Figure 12-6. Data Memory Organization for Lattice Filters




Example 12-39. Inverse Lattice Filter 
*
*  TITL INVERSE LATTICE FILTER
*
*  SUBROUTINE LATINV
*
*  LATINV == LATTICE FILTER (LPC INVERSE FILTER - ANALYSIS)
*
*  TYPICAL CALLING SEQUENCE:
*
*       load    R2
*       load    AR0
*       load    AR1
*       load    RC
*       CALL    LATINV
*
*  ARGUMENT ASSIGNMENTS:
*
*   ARGUMENT |  FUNCTION
*   ---------+--------------------------------------------
*   R2       |  f(0,n) = x(n)
*   AR0      |  ADDRESS OF FILTER COEFFICIENTS (k(1))
*   AR1      |  ADDRESS OF BACKWARD PROPAGATION
*            |  VALUES (b(0,n-1))
*   RC       |  RC = p - 2
*
*  REGISTERS USED AS INPUT: R2, AR0, AR1, RC
*  REGISTERS MODIFIED: R0, R1, R2, R3, RS, RE, RC, AR0, AR1
*  REGISTER CONTAINING RESULT: R2 (f(p,n))
*
*  PROGRAM SIZE: 10 WORDS
*
*  EXECUTION CYCLES: 13 + 3 * (p-1)
*


        .global LATINV
*
*  i = 1
*
LATINV  MPYF3   *AR0,*AR1,R0
*                                   ; k(1) * b(0,n-1) -> R0
*                                   ; Assume f(0,n) -> R2.
        LDF     R2,R3               ; Put b(0,n) = f(0,n) -> R3.
        MPYF3   *AR0++(1),R2,R1
*                                   ; k(1) * f(0,n) -> R1
*
        RPTB    LOOP
        MPYF3   *AR0,*++AR1(1),R0   ; k(i) * b(i-1,n-1) -> R0
||      ADDF3   R2,R0,R2            ; f(i-1-1,n)+k(i-1)
*                                   ;   *b(i-1-1,n-1)
*                                   ; = f(i-1,n) -> R2
*
*                                   ; b(i-1-1,n-1)+k(i-1)*f(i-1-1,n)
        ADDF3   *-AR1(1),R1,R3      ; = b(i-1,n) -> R3
||      STF     R3,*-AR1(1)         ; b(i-1-1,n) -> b(i-1-1,n-1)
*
LOOP    MPYF3   *AR0++(1),R2,R1
*                                   ; k(i) * f(i-1,n) -> R1
*
*  I = P+1 (CLEANUP)
*
        ADDF3   R2,R0,R2            ; f(p-1,n)+k(p)*b(p-1,n-1)
*                                   ; = f(p,n) -> R2
*
*                                   ; b(p-1,n-1)+k(p)*f(p-1,n)
        ADDF3   *AR1,R1,R3          ; = b(p,n) -> R3
||      STF     R3,*AR1             ; b(p-1,n) -> b(p-1,n-1)
*
*  RETURN SEQUENCE
*
        RETS                        ; RETURN
*
*  end
*
        .end

The (forward) lattice filter has a structure very similar to the inverse filter, as
shown in Figure 12-7. The corresponding equations describing the lattice
filter are:


     f(i-1,n) = f(i,n) - k(i) b(i-1,n-1)
     b(i,n) = b(i-1,n-1) + k(i) f(i-1,n)


Initial conditions:
     f(p,n) = x(n), b(i,n-1) = 0 for i = 1,...,p
Final conditions:
     y(n) = f(0,n).
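

The forward (synthesis) recursion can be modeled the same way. The sketch below
is a hypothetical Python reference, not the routine from the manual; the name
lattice_forward and the per-sample state list are assumptions made for
illustration. It takes b(0,n) = f(0,n), which is the value stored back as
b(0,n-1) for the next sample, and zero initial backward terms.

```python
def lattice_forward(x, k):
    """Forward (synthesis) lattice filter; returns y(n) = f(0, n).

    x -- excitation samples, x(n) = f(p, n)
    k -- reflection coefficients, k[i-1] holding k(i)
    Assumption: all backward terms are zero before the first sample.
    """
    p = len(k)
    b_prev = [0.0] * (p + 1)      # b_prev[i] holds b(i, n-1), i = 0..p
    y = []
    for xn in x:
        f = xn                    # f(p, n) = x(n)
        b_now = [0.0] * (p + 1)
        for i in range(p, 0, -1):
            f = f - k[i - 1] * b_prev[i - 1]          # f(i-1,n)
            b_now[i] = b_prev[i - 1] + k[i - 1] * f   # b(i,n)
        b_now[0] = f              # b(0, n) = f(0, n) = y(n)
        b_prev = b_now
        y.append(f)
    return y
```

For k = [0.5, 0.25] the impulse response begins 1, -0.625, 0.140625, which
matches the all-pole filter 1 / (1 + 0.625 z^-1 + 0.25 z^-2), the inverse of the
analysis filter built from the same coefficients.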


The data memory organization is identical to that of the inverse filter, as
shown in Figure 12-6. Example 12-40 is the implementation of the lattice
filter on the TMS320C30.


[Figure: signal flow not reproduced. Labeled nodes: x(n) = f(p,n), f(2,n);
stage coefficient Kp.]

Figure 12-7. Structure of the (Forward) Lattice Filter


Example 12-40. Lattice Filter


*
*  TITL LATTICE FILTER
*
*  SUBROUTINE LATICE
*
*       LOAD    AR0
*       LOAD    AR1
*       LOAD    RC
*       CALL    LATICE
*
*  ARGUMENT ASSIGNMENTS:
*
*   ARGUMENT |  FUNCTION
*   ---------+--------------------------------------------
*   R2       |  F(P,N) = E(N) = EXCITATION
*   AR0      |  ADDRESS OF FILTER COEFFICIENTS (K(P))
*   AR1      |  ADDRESS OF BACKWARD PROPAGATION VALUES (B(P-1,N-1))
*   RC       |  RC = P - 2
*
*  REGISTERS USED AS INPUT: R2, AR0, AR1, RC
*  REGISTERS MODIFIED: R0, R1, R2, R3, RS, RE, RC, AR0, AR1
*  REGISTER CONTAINING RESULT: R2 (F(0,N))
*
*  STACK USAGE: NONE
*
*  PROGRAM SIZE: 12 WORDS
*
*  EXECUTION CYCLES: 13 + 5 * (P-1)
*


        .global LATICE
*
LATICE  MPYF3   *AR0,*AR1,R0
*                                   ; K(P) * B(P-1,N-1) -> R0
        SUBF3   R0,R2,R2            ; Assume F(P,N) -> R2.
*                                   ; F(P,N)-K(P)*B(P-1,N-1)
*                                   ; = F(P-1,N) -> R2
*
*  2 <= I <= P
*
        RPTB    LOOP
        MPYF3   *AR0,R2,R1          ; K(I) * F(I-1,N) -> R1
        MPYF3   *--AR0(1),*-AR1(1),R0
*                                   ; K(I-1) * B(I-1-1,N-1) -> R0
        ADDF3   *AR1--(1),R1,R3
*                                   ; B(I-1,N-1) + K(I) * F(I-1,N)
*                                   ; = B(I,N) -> R3
        STF     R3,*+AR1(2)         ; B(I,N) -> B(I,N-1)
LOOP    SUBF3   R0,R2,R2            ; F(I-1,N)-K(I-1)*B(I-1-1,N-1)
*                                   ; = F(I-1-1,N) -> R2
*
*  I = 1 (CLEANUP)
*
        MPYF3   *AR0,R2,R1          ; K(1) * F(0,N) -> R1
        ADDF3   *AR1,R1,R3          ; B(0,N-1) + K(1) * F(0,N)
*                                   ; = B(1,N) -> R3
        STF     R3,*+AR1(1)         ; B(1,N) -> B(1,N-1)
||      STF     R2,*AR1             ; F(0,N) -> B(0,N-1)
*
*  RETURN SEQUENCE
*
        RETS                        ; RETURN
*
*  end
*
        .end


12.5 Programming Tips


Programming style is highly personal, and reflects each individual's prefer- 
ences and experiences. The purpose of this section is not to impose any par- 
ticular style. Instead, it intends to emphasize some of the features of the 
TMS320C30 that can help in producing faster and/or shorter programs. The 
following covers both C compiler and assembly language programming. 


12.5.1 C-Callable Routines 


The TMS320C30 was designed with a high-level language (HLL) in mind.
The large register file, the software stack, and the large memory space make
implementation of an HLL compiler an easy task. The first such implementation
supplied is a C compiler. Use of the C compiler increases the transportability
of applications that have been tested on large, general-purpose computers,
and decreases their porting time.


For best usage of the compiler: 


1) Write the application in the high-level language. 

2) Debug the program. 

3) Estimate if it runs in real-time. 

4) If not, identify places where most of the execution time is spent. 

5) Optimize these areas by writing assembly language routines imple- 
menting the functions. 

6) Call the routines from the C program as C functions. 


When writing a C program, a simple way to increase the execution speed is 
to maximize the use of register variables. For more information, refer to the 
TMS320C30 C Compiler Reference Guide. 


There are certain conventions that have to be observed in writing a C-callable 
routine. These conventions are outlined in the Runtime Environment chapter 
of the “TMS320C30 C Compiler Reference Guide”. Certain registers are saved 
by the calling function and others need to be saved by the called function. 
The C compiler manual will help achieve a very clean interface. The end result 
is the readability and natural flow of a high-level language combined with the 
efficiency and special-feature use of assembly language. 


12.5.2 Hints for Assembly Coding


Each program will have its particular requirements. Not all possible optimiza-
tions will make sense in every case. The suggestions presented in this section
can be used as a checklist of available software tools.


•  Use delayed branches. Delayed branches take a single cycle to ex-
   ecute; regular branches take four. The three instructions that follow a
   delayed branch are executed whether or not the branch is taken. If fewer
   than three instructions are available to fill these slots, use the delayed
   branch and pad with NOPs. Machine cycles (time) are still being saved.


•  Apply the repeat single/block construct. In this way, loops are
   achieved with no overhead. Nesting such constructs normally will not
   increase efficiency, so try to use the repeat feature on the most often
   performed loop. Note that RPTS is not interruptible, and the executed
   instruction is not refetched for execution. This frees the buses for op-
   erands.


•  Use parallel instructions. It is possible to have a multiply in parallel
   with an add (or subtract), and stores in parallel with any multiply or
   ALU operation. This increases the number of operations executed in a
   single cycle. For maximum efficiency, observe the addressing modes
   used in parallel instructions and arrange the data appropriately.


•  Maximize the use of registers. The registers are a very efficient,
   easy way to access scratch-pad memory. Extensive use of the register
   file will also help when using parallel instructions and in avoiding the
   pipeline conflicts encountered using the registers in addressing modes.


•  Use the cache, especially in conjunction with slow external memory.
   The cache is transparent to the user, so make sure that it is enabled.


•  Use internal memory instead of external memory. The internal
   memory (2K x 32 bits RAM and 4K x 32 bits ROM) is considerably
   faster to access. In a single cycle, two operands can be brought from
   internal memory. A way of maximizing performance is to use the DMA
   in parallel with the CPU to transfer data to internal memory before op-
   erating on them.


•  Avoid pipeline conflicts. If there is no problem with program speed,
   ignore this suggestion. For time-critical operations, make sure that cy-
   cles are not missed because of conflicts. The way to identify such
   conflicts is to run the trace function on the development tools (simula-
   tor, emulators) with the program tracing option enabled. The trace
   immediately identifies the pipeline conflicts. The appropriate section
   of this User's Guide explains the reason for each conflict. Steps can
   then be taken to correct the problem.
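

The delayed-branch item in the checklist above can be made concrete with a small
cycle count. The following Python sketch is only an accounting model built from
the figures quoted in that item (one cycle for a delayed branch, four for a
regular branch, three delay slots); it assumes single-cycle instructions and no
pipeline conflicts, and the name branch_cost is hypothetical.

```python
REGULAR_BRANCH_CYCLES = 4   # from the text: a regular branch takes four cycles
DELAYED_BRANCH_CYCLES = 1   # from the text: a delayed branch takes one cycle
DELAY_SLOTS = 3             # instructions after a delayed branch always execute


def branch_cost(useful, delayed):
    """Cycles for `useful` single-cycle instructions plus one branch.

    useful  -- number of instructions (0..3) that could fill the delay slots
    delayed -- True to model a delayed branch padded with NOPs
    """
    if not 0 <= useful <= DELAY_SLOTS:
        raise ValueError("useful must be between 0 and 3")
    if delayed:
        # The useful instructions sit in the slots; NOPs fill the remainder.
        return DELAYED_BRANCH_CYCLES + DELAY_SLOTS
    # The useful instructions run first, then the regular branch.
    return useful + REGULAR_BRANCH_CYCLES
```

Under this model the saving equals the number of useful instructions moved into
the delay slots, so even one such instruction makes the delayed form cheaper.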


The above checklist is not exhaustive, and it does not address the more de-
tailed features outlined in the different sections of this manual. To exploit the
full power of the TMS320C30, carefully study the architecture, hardware
configuration, and instruction set of the device, described in earlier
chapters.


Section 13


Hardware Applications 


The TMS320C30's advanced interface design allows this device to be used to
implement a wide variety of system configurations. Its two external buses and
DMA capability provide a parallel 32-bit interface to byte- or word-wide de-
vices, while the interrupt interface, dual serial ports, and general-purpose
digital I/O provide communication with a multitude of peripherals.


This section describes how to use the TMS320C30's interfaces to connect to
various external devices. Specific discussions include implementation of par-
allel interface to devices with and without wait states, use of DMA and
general-purpose I/O, and multiprocessing considerations.


Major topics discussed in this section are as follows: 


•  System Configuration Options Overview (Section 13.1)


•  Primary Bus Interface (Section 13.2)
   -  Zero Wait State Interface to RAMs
   -  Ready Generation
   -  Bank Switching Techniques


•  Expansion Bus Interface (Section 13.3)


•  System Control Functions (Section 13.4)
   -  Clock Oscillator Circuitry
   -  Reset Signal Generator


•  User Target Design Considerations When Using the XDS1000 (Section
   13.5)


13.1 System Configuration Options Overview


The various TMS320C30 interfaces allow connections to a wide variety of
different device types. Each of these interfaces is tailored to a particular family
of devices.


13.1.1 Categories of Interfaces on the TMS320C30 


The interface types on the TMS320C30 fall into several different categories
depending on the devices to which they are intended to be connected. Each
interface comprises one or more signal lines which transfer information and
control its operation. Shown in Figure 13-1 are the signal line groupings for
each of these various interfaces.


[Figure: signal groupings not reproduced. Labeled groups: bus control, DMA
interface, interrupt, external flags, system control (system reset, ROM
enable), master clock.]

Figure 13-1. External Interfaces on the TMS320C30


All of the interfaces are independent of one another and different operations
may be performed simultaneously on each interface.


The Primary and Expansion buses implement the memory-mapped interface to
the device. The external DMA interface allows external devices to cause the
processor to relinquish the Primary bus and allow direct memory access.


13.1.2 Typical System Block Diagram

The devices which can be interfaced to the TMS320C30 include memory, 
DMA devices, and numerous parallel and serial peripherals and I/O devices. 
Figure 13-2 illustrates a typical configuration of a TMS320C30 system 
showing different types of external devices and the interfaces to which they 
are connected. 


[Figure: block diagram not reproduced. Labeled blocks around the TMS320C30:
DMA devices, memory, external peripherals, DMA interface, primary bus,
expansion bus, interrupt interface, timer interface, I/O devices, external
flags, system control, serial ports, TCM29C13 CODEC, TLC32040 AIC (analog
I/O), clock and reset generators.]

Figure 13-2. Possible System Configurations


This block diagram constitutes more or less a fully expanded system. In an
actual design any subset of the illustrated configuration may be used.


13.2 Primary Bus Interface


The primary bus is used by the TMS320C30 to access the majority of its
memory-mapped locations. Therefore, when a large amount of external
memory is required in a system, it is typically interfaced to the primary bus.
The expansion bus (discussed in the next subsection) actually comprises two
mutually exclusive interfaces, controlled by the MSTRB and IOSTRB signals
respectively. Cycles on the expansion bus controlled by the MSTRB signal are
identical in timing to cycles on the primary bus, with the exception that bank
switching is not implemented on the expansion bus. Accordingly, the dis-
cussion of primary bus cycles in this section applies equally to MSTRB cycles
on the expansion bus.


Although both the primary bus and the expansion bus may be used to inter-
face to a wide variety of devices, the devices most commonly interfaced to
these buses are memories. Therefore, detailed examples of memory interface
will be presented in this subsection.


13.2.1 Zero Wait-State Interface To RAMs


For a full-speed, zero wait-state interface to any device, the TMS320C30 re-
quires a read access time of 35 ns from address stable to data valid. Since, for
most memories, access time from chip select is the same as access time from
address, it is theoretically possible to use 35 ns memories at full speed with
the TMS320C30. This, however, dictates that there be no delays present be-
tween the processor and the memories. This is usually not the case in practice,
due to interconnection delays and the fact that typically some gating is re-
quired for chip select generation. Therefore, slightly faster memories are gen-
erally required in most systems. If one level of reasonably high-speed gating
(below 10 ns in propagation delay) is used to generate chip select for the
memories, 25 ns devices may be used.


Among currently available RAMs, there are two distinct categories of devices
with different interface characteristics. These two categories are RAMs with-
out output enable control lines (OE), which include the 1-bit wide organized
RAMs and most of the 4-bit wide RAMs, and those with OE controls, which
include the byte wide and a few of the 4-bit wide RAMs. Many of the fastest
RAMs do not provide OE control, and use chip select (CS) controlled write
cycles to ensure that data outputs do not turn on for write operations. In CS
controlled write cycles, the write control line (WE) goes low prior to CS going
low, and internal logic holds the outputs disabled until the cycle is completed.
Using CS controlled write cycles is an efficient way to interface fast RAMs
without OE controls to the TMS320C30 at full speed.


Figure 13-3 shows the TMS320C30 interfaced to Cypress Semiconductor's
CY7C164 25 ns 16k x 4-bit CMOS static RAMs with zero wait states using
CS controlled write cycles. These RAMs are arranged to implement 16k 32-bit
words located at addresses 00000H through 03FFFH, which are the first 16k
words in external memory. Note that in Figure 13-3 the RDY input is tied low,
selecting zero wait states for all accesses on the bus.


[Figure: schematic not reproduced. Labeled devices: TMS320C30, Cypress
CY7C164-25.]

Figure 13-3. RAM Interface - No OE


In this circuit, chip select is generated from STRB and A23 using a 74AS32, 
whose propagation delay is only 5.8 ns. Thus, the chip select delay added to 
the RAM’s 25 ns chip select access time satisfies the TMS320C30's 35 ns read 
access time from address. This approach works well if only a single bank of 
external memory is implemented where the chip select decode can be accom- 
plished in only one level of gating. If more than one bank is required to im- 
plement very large memory spaces, bank switching can be used to provide for 
multiple bank select generation while still maintaining full speed accesses 
within each bank. Bank switching is discussed in detail in a later subsection. 


Figure 13-4. Interface Read Timing


Figure 13-5. Interface Write Timing


Figures 13-4 and 13-5 show the read and write timings of this interface, re-
spectively. For read operations, WE (R/W) is inactive (high), and the device
is selected whenever both STRB and A23 are low. The total time from address
to data from the RAM is therefore:


     tacc = t1 + t2 = 5.8 + 25 = 30.8 ns


This easily meets the TMS320C30's 35 ns access time requirement. For write
operations, address and R/W change state far enough away in time from the
low STRB pulse to allow this interface to easily meet specifications for most
RAMs' CS controlled write cycles. In this case, the CY7C164's outputs disable
at the beginning of the cycle early enough (t1 = 7 ns) to avoid bus con-
tention with the TMS320C30. Data is then driven into the RAMs as STRB
goes low. The RAMs require 13 ns of write data setup prior to CS going high,
and this design provides around 65 ns (t2). A data hold time of 0 ns (t3) is
required by the RAMs, and this design provides greater than 10 ns. Finally,
the RAMs' setup and hold times for address with respect to CS of 0 ns are also
met with a clear margin.
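

The read-access budget used above is simple enough to check mechanically when
evaluating other gate and memory combinations. The sketch below is an
illustrative Python helper, not something defined by the manual; the function
name and parameter names are assumptions, and only the 35 ns requirement and
the 5.8 ns and 25 ns example values come from the text.

```python
def read_access_margin(ram_access_ns, gate_delay_ns=0.0, required_ns=35.0):
    """Return the slack (ns) between the required and actual read access time.

    ram_access_ns -- RAM access time from chip select or address
    gate_delay_ns -- propagation delay of the chip select gating, if any
    required_ns   -- processor requirement from address stable to data valid
    A negative result means the combination is too slow for zero wait states.
    """
    total_access_ns = gate_delay_ns + ram_access_ns
    return required_ns - total_access_ns
```

For the circuit of Figure 13-3, read_access_margin(25.0, 5.8) gives roughly
4.2 ns of slack, while a 35 ns RAM behind the same gate would come out negative.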


Some RAMs with OE controls can also use CS controlled write cycles and this
interface may be used with some of these devices with OE tied low. There
are, however, two requirements for the use of OE RAMs with this interface.
First, the RAM's OE input must be gated with chip select and WE internally
so that the device's outputs do not turn on unless a read is being performed.
Secondly, the RAM must allow address inputs to change while WE is low,
which some RAMs specifically prohibit.


Many RAMs with OE controls that do not meet the design criteria for the cir-
cuit shown in Figure 13-3 may be interfaced to the TMS320C30 using the
approach shown in Figure 13-6.


[Figure: schematic not reproduced. Labeled devices: TMS320C30, IDT7198-25.]

Figure 13-6. RAM Interface - OE


This design shows an interface to Integrated Device Technology's IDT7198
25 ns 16k x 4-bit CMOS static RAMs using OE to enable and disable the data
outputs.


In this circuit, chip select is driven directly by a single address line, which lo-
cates the RAM at addresses 00000H through 03FFFH in external memory.
The RAM's WE input is generated by ANDing R/W and STRB, and therefore
WE goes low after CS only during write cycles. This satisfies the RAM's re-
quirement that address never changes when WE is low.


The timing of read operations, shown in Figure 13-7, is very straightforward
since CS is driven directly. The read access time of the circuit, t1, is therefore
simply the RAM's chip select/address access time, which is 25 ns. This pro-
vides 10 ns of margin over the TMS320C30's 35 ns requirement.


Figure 13-7. Read Operations Timing


Figure 13-8. Write Operations Timing


During write operations, as shown in Figure 13-8, the RAM's outputs are
disabled after a delay of t1 following R/W going low. This delay comprises
the inverter propagation delay and the RAM's turn-off delay; therefore t1 is
given by:


     t1 = 5 + 15 = 20 ns


which results in the outputs being disabled no later than the falling edge of
H1, thereby avoiding bus contention with the TMS320C30. The circuit's data
setup and hold times of approximately 65 and 10 ns, respectively, also easily
meet the RAM's timing requirements.


As with the circuit of Figure 13-3, if more complex chip select decode is re- 
quired than can be accomplished in time to meet zero wait state timing, wait 
states or bank switching techniques (discussed in a later subsection) should 
be used. 


It should be noted that the IDT7198's OE control is gated with CS internally;
therefore, the RAM's outputs are not enabled unless the device is selected.
This is critical if there are any other devices connected to the same bus; if there
are no other devices connected to the bus, then OE need not be gated inter-
nally with chip select.


13.2.2 Ready Generation


The use of wait states can greatly increase system flexibility and reduce hard-
ware requirements over systems without wait state capability. The
TMS320C30 has the capability of generating wait states on either the primary
bus or the expansion bus; both buses have independent sets of ready control
logic. Ready generation is discussed in this subsection from the perspective
of the primary bus interface. Wait state operation on the expansion bus is
identical to that of the primary bus, however, so these discussions pertain
equally well to expansion bus operation. Ready generation will not be in-
cluded in the specific discussions of the expansion bus interface.


Wait states are generated on the basis of the internal wait state generator, the
external ready input (RDY), or the logical AND or OR of the two (see Section
8.3). When enabled, internally generated wait states affect all external cycles,
regardless of the address accessed. If different numbers of wait states are re-
quired for various external devices, the external RDY input may be used to
tailor wait state generation to specific system requirements.


If the logical OR (or electrical AND since the signals are low true) of the ex- 
ternal and wait count ready signals is selected, the earlier of either of the two 
signals will generate a ready condition and allow the cycle to be completed. 
It is not required that both signals be present. 


The OR of the two ready signals can be used to implement wait states for 
devices which require a greater number of wait states than are implemented 
with external logic (up to eight). This feature is useful, for example, if a sys- 
tem contains some fast and some slow devices. In this case, fast devices can 
generate ready externally with a minimum of logic, and slow devices can use 
the internal wait counter for larger numbers of wait states. Thus, when fast 
devices are accessed, the external hardware responds promptly with ready 
which terminates the cycle. When slow devices are accessed, the external 
hardware does not respond, and the cycle is appropriately terminated after the 
internal wait count. 


The OR of the two ready signals may also be used if conditions occur which 
require termination of bus cycles prior to the number of wait states imple- 
mented with external logic. In this case, a shorter wait count is specified in- 
ternally than the number of wait states implemented with the external ready 
logic, and the bus cycle is terminated after the wait count. This feature may 
also be used as a safeguard against inadvertent accesses to nonexistent me- 
mory which would never respond with ready and therefore lock up the 
TMS320C30. 


If the OR of the two ready signals is used, however, and the internal wait state
count is less than the number of wait states implemented externally, the ex-
ternal ready generation logic must have the ability to reset its sequencing to
allow a new cycle to begin immediately following the end of the internal wait
count. This requires that, under these conditions, consecutive cycles must be
from independently decoded areas of memory and that the external ready
generation logic be capable of restarting its sequence as soon as a new cycle
begins. Otherwise, the external ready generation logic may lose synchroni-
zation with bus cycles and therefore generate improperly timed wait states.


If the logical AND (electrical OR) of the wait count and external ready signals
is selected, the later of the two signals will control the internal ready signal,
but both signals must occur. Accordingly, external ready control must be im-
plemented for each wait state device in addition to the wait count ready signal
being enabled.


This feature is useful if there are devices in a system which are equipped to 
provide a ready signal but cannot respond quickly enough to meet the 
TMS320C30’s timing requirements. In particular, if these devices normally 
indicate a ready condition and, when accessed, respond with a wait until they 
become ready, the logical AND of the two ready signals can be used to save 
hardware in the system. In this case, the internal wait counter can be used to 
provide wait states initially, and become ready after the external device has 
had time to send a not ready indication. The internal wait counter then re- 
mains ready until the external device also becomes ready, which terminates the 
cycle. 


Additionally, the AND of the two ready signals may be used for extending the 
number of wait states for devices which already have external ready logic im- 
plemented but require additional wait states under certain unique circum- 
stances. 
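

The four ready-selection modes discussed above can be summarized with a small
behavioral model. The Python sketch below is a simplification written for this
discussion, not a description of the hardware: it treats each ready source as
becoming (and staying) ready after a fixed number of wait states, uses None for
an external source that never responds, and the function name and mode strings
are hypothetical.

```python
def wait_states(mode, internal, external):
    """Number of wait states before the bus cycle terminates, or None if never.

    mode     -- "internal", "external", "or", or "and"
    internal -- wait states counted by the internal wait state generator
    external -- wait states before external RDY is asserted (None = never)
    """
    if mode == "internal":
        return internal
    if mode == "external":
        return external
    if mode == "or":
        # The earlier of the two signals completes the cycle.
        return internal if external is None else min(internal, external)
    if mode == "and":
        # Both signals must occur; the later one completes the cycle.
        return None if external is None else max(internal, external)
    raise ValueError("unknown mode: " + mode)
```

In the OR mode a fast device that asserts ready at once ends the cycle with no
wait states, while an access that draws no external response still ends after the
internal count; in the AND mode a missing external ready stalls the cycle.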


In the implementation of external ready generation hardware, the particular 
technique employed depends heavily on the specific characteristics of the 
system. The optimum approach to ready generation varies depending on the 
relative number of wait state and non-wait state devices in the system and the 
maximum number of wait states required for any one device. The approaches 
discussed here are intended to be general enough for most applications, and 
are easily modifiable to comprehend many different system configurations. 


In general, ready generation involves the following three functions: 


1) Segmenting the address space in some fashion to distinguish fast
and slow devices.


2) Generating properly timed ready indications.


3) Logically ORing all of the separate ready timing signals together to
connect to the physical ready input.


Segmentation of the address space is required so that a unique indication of
each of the particular areas within the address space that require wait states
can be obtained. This segmentation is commonly implemented in a system in
the form of chip select generation. Chip select signals may be used to initiate
wait states in many cases. Occasionally, however, chip select decoding con-
siderations may provide signals which will not allow ready input timing re-
quirements to be met. In this case, coarse address space segmentation may
be made on the basis of a small number of address lines, where simpler gating
allows signals to be generated more quickly. In either case, the signal indi-
cating that a particular area of memory is being addressed is normally used to
initiate a ready or wait state indication.


Once the region of address space being accessed has been established, a
timing circuit of some sort is normally used to provide a ready indication to the
processor at the appropriate point in the cycle to satisfy each device's unique
requirements.


Finally, since indications of ready status from multiple devices are typically
present, an OR gate is commonly used to combine the signals to drive the RDY
input.


One of two basic approaches may be taken in the implementation of ready 
control logic depending upon the state in which the ready input is to be be- 
tween accesses. If RDY is low between accesses, the processor is always 
ready unless a wait state is required; if RDY is high between accesses, the 
processor will always enter a wait state unless a ready indication is generated. 


If RDY is low between accesses, control of full speed devices is straightfor-
ward; no action is necessary since ready is always active unless otherwise re-
quired. Devices requiring wait states, however, must drive ready high fast
enough to meet the input timing requirements. Then, after an appropriate de-
lay, a ready indication must be generated. This can be quite difficult in many
circumstances since wait state devices are inherently slow and often require
complex select decoding.


If RDY is high between accesses, zero wait state devices, which tend to be 
inherently fast, can usually respond immediately with a ready indication. Wait 
state devices may simply delay their select signals appropriately to generate a 
ready. Typically, this approach results in the most efficient implementation of 
ready control logic. Figure 13-9 shows a circuit of this type which can be used 
to generate O, 1, or 2 wait states for multiple devices in a system. 


[Figure: schematic not reproduced. Labeled elements: TMS320C30 address bus
and STRB, 74ALS138 producing device selects, inputs for 2 wait state devices,
1 wait state devices, and 0 wait state devices, 74AS20 and 74AS21 gates, two
flip-flops, a 4.7 kΩ pull-up to +5 V, and the RDY output.]

Figure 13-9. Circuit For Generation of 0, 1, or 2 Wait States For
             Multiple Devices

In this circuit, full speed devices drive ready directly through the 74AS21, and 
the two flip-flops delay wait state devices’ select signals one or two H1 cycles 
to provide 1 or 2 wait states. 
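

The behavior described for this circuit can be mimicked with a two-stage shift
register clocked once per H1 cycle. The Python sketch below is a behavioral
model only; the exact gate wiring is an assumption (2 wait state selects feed
the first flip-flop, its output is combined with 1 wait state selects to feed
the second, and the second flip-flop's output is combined with 0 wait state
selects to drive RDY), positive logic is used throughout, and the function name
is hypothetical.

```python
def generated_wait_states(device_class, max_cycles=8):
    """Count H1 cycles until the modeled RDY goes active for one access.

    device_class -- 0, 1, or 2: which select input the accessed device drives
    Returns the number of wait states, or None if RDY never goes active.
    """
    if device_class not in (0, 1, 2):
        raise ValueError("device_class must be 0, 1, or 2")
    sel0 = device_class == 0
    sel1 = device_class == 1
    sel2 = device_class == 2
    ff_a = ff_b = False                 # both flip-flops cleared between accesses
    for cycle in range(max_cycles):
        ready = sel0 or ff_b            # final gate driving RDY
        if ready:
            return cycle
        # Clock edge: the second stage sees the first stage's previous output.
        ff_a, ff_b = sel2, (ff_a or sel1)
    return None
```

With the select held active for the whole access, the model returns 0, 1, and 2
wait states for the three device classes.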


Considering the TMS320C30’s ready delay time of 8 ns following address, 
zero wait state devices must use ungated address lines directly to drive the 
input of the 74AS21, since this gate contributes a maximum propagation delay 
of 6 ns to the RDY signal. Thus, zero wait state devices should be grouped 
together within a coarse segmentation of address space if other devices in the 
system require wait states. 


With this circuit, devices requiring wait states may take up to 42 ns from a
valid address on the TMS320C30 to provide inputs to the 74AS20s.
Typically, this allows sufficient time for any decoding required in generating
select signals for slower devices in the system. For example, the 74ALS138,
driven by address and STRB, can generate select decodes in 22 ns, which
easily meets the TMS320C30's timing requirements.


With this circuit, unused inputs to either the 74AS20s or the 74AS21 should
be tied to a logic high level to prevent noise from generating spurious wait
states.


If more than 2 wait states are required by devices within a system, other ap- 
proaches may be employed for ready generation. If between three and eight 
wait states are required, additional flip-flops may be included, in the same 
manner as shown in Figure 13-9, or internally generated wait states may be 
used in conjunction with external hardware. If more than eight wait states 
are required, an external circuit using a counter may be used to supplement the 
internal wait state generator’s capabilities. 


13.2.3 Bank Switching Techniques 

The TMS320C30’s programmable bank switching feature can greatly ease 
system design when large amounts of memory are required. This feature 
provides a period of time, not otherwise present, during which all device 
selects are disabled (refer to Section 8.4 for further information regarding 
bank switching). During this interval, slow devices are allowed time to turn 
off before other devices have the opportunity to drive the data bus, thus 
avoiding bus contention. 


When bank switching is enabled, any time a portion of the high order address 
lines change, as defined by the contents of the BNKCMPR register, STRB goes 
high for one full H1 cycle. Provided STRB is included in chip select decodes, 
this causes all devices to be disabled during this period. The next bank of 
devices is not enabled until STRB goes low again. 


Bank switching is not required during writes since these cycles always exhibit 
an inherent one-half H1 cycle setup of address information before STRB goes 
low. Thus, when using bank switching for read/write devices, a minimum of 
half of one H1 cycle of address setup is provided for all accesses. Therefore, 
large amounts of memory can be implemented without wait states or extra 
hardware required for isolation between banks. Also, note that access time for 
cycles during bank switching is the same as that of cycles without bank 
switching, and accordingly, full speed accesses may still be accomplished 
within each bank. 


The circuit shown in Figure 13-10 illustrates the use of bank switching with 
Cypress Semiconductor's CY7C185 25 ns 8kx8 CMOS static RAM. This cir- 
cuit implements 32k 32-bit words of memory with full speed zero wait state 
accesses within each bank. 

Figure 13-10. Bank Switching For Cypress Semiconductor’s CY7C185 


Each of the four banks in this circuit is selected using a decode of A15-A13 
generated by the 74AS138. With the BNKCMPR register set to >0Bh, the 
banks will be selected on even 8k word boundaries starting at location zero 
in external memory space. 
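
The bank comparison can be modeled in a few lines. The Python sketch below assumes that the BNKCMPR value gives the number of most significant address lines compared, so that 0Bh selects A23-A13; this reading is consistent with the 8k word boundaries described above, but Section 8.4 remains the authority. The function names are hypothetical.

```python
# Software model of the bank-switch comparison on the primary bus.
# Assumption: BNKCMPR holds the number of most significant address
# lines (out of 24) that are compared between consecutive accesses.

ADDRESS_BITS = 24  # primary bus address lines A23-A0

def bank_of(address: int, bnkcmpr: int) -> int:
    """Return the bank field: the bnkcmpr most significant address bits."""
    return (address & 0xFFFFFF) >> (ADDRESS_BITS - bnkcmpr)

def needs_bank_switch(previous: int, current: int, bnkcmpr: int) -> bool:
    """True when the compared address bits differ between two accesses."""
    return bank_of(previous, bnkcmpr) != bank_of(current, bnkcmpr)

BNKCMPR = 0x0B  # compare A23-A13, giving 8k word banks

print(needs_bank_switch(0x0000, 0x1FFF, BNKCMPR))  # same 8k bank
print(needs_bank_switch(0x1FFF, 0x2000, BNKCMPR))  # crosses a bank boundary
```

In this model only the second pair of accesses would insert the extra H1 cycle with STRB high; accesses within one 8k word bank proceed at full speed.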


This circuit could not have been implemented without bank switching, since 
data output’s turn-on and turn-off delays would have caused bus conflicts, 
and full speed accesses do not allow enough time for chip select decoding for 
the four banks. Here, the propagation delay of the 74AS138 is only involved 
during bank switches, where there is sufficient time between cycles to allow 
new chip selects to be decoded. 


The timing of this circuit for read operations using bank switching is shown 
in Figure 13-11. With the BNKCMPR register set to >0Bh, when a bank 
switch occurs, the bank address on address lines A23-A13 is updated during 
the extra H1 cycle while STRB is high. Then, after chip select decodes have 
stabilized, and the previously selected bank has disabled its outputs, STRB 
goes low for the next read cycle. Further accesses occur at full speed with the 
normal bus timings, as long as another bank switch is not necessary. Write 
cycles do not require bank switching due to the inherent address setup pro- 
vided in their timings. 

Figure 13-11. Timing For Read Operations Using Bank Switching 


This timing is summarized in Table 13-1. 


Table 13-1. Bank Switching Interface Timing 


Time 
Interval   Event                                 Time Period 

   1       H1 falling to address/STRB valid      10 ns 
   2       STRB to select delay                  45 ns 
   3       Memory disable from select            15 ns 
   4       H1 falling to STRB                    10 ns 
   5       STRB to select delay                  45 ns 
   6       Memory output enable delay 

13.3 Expansion Bus Interface 


The TMS320C30’s expansion bus interface provides a second complete paral- 
lel bus which can be used to implement data transfers concurrently with, and 
independent of, operations on the primary bus. The expansion bus comprises 
two mutually exclusive interfaces controlled by the MSTRB and IOSTRB sig- 
nals, respectively. These two signals are activated depending on what section 
of the memory space is accessed. This subsection discusses interface to the 
expansion bus using IOSTRB; MSTRB cycles are identical in timing to primary 
bus cycles, and are discussed in Section 13.2. 


Unlike the primary bus, both read and write cycles on the I/O portion of the 
expansion bus are two H1 cycles in duration and exhibit the same timing. The 
XR/W signal is high for reads and low for writes. Since I/O accesses take two 
cycles, many peripherals that require wait states if interfaced either to the pri- 
mary bus or using MSTRB may be used in a system without the need for wait 
states. Specifically, any devices with address access times greater than the 
35 ns required by the primary bus but no greater than 46 ns can be interfaced 
to the I/O bus without wait states. 


A/D converters are one common DSP system component which often falls 
into this category. These devices are available in many speed ranges and with 
a variety of features, and while some may require one or more wait states on 
the I/O bus, others may be used at full speed. 


One A/D converter that interfaces to the I/O bus without wait states and re- 
quires minimal additional logic is the AD1332 from Analog Devices. Figure 
13-12 illustrates an interface to this device. 

Figure 13-12. Expansion Bus Interface to A/D Converter 


The interface uses a 74ALS138 to decode chip select for the converter. This 
configuration is shown assuming that other peripheral devices in the system 
also require chip select decodes. XA(8-10) are decoded to locate the con- 
verter at address 0804000h, which is the beginning of the I/O address space. 
Other peripherals may also use the outputs of the decoder, which generates 
chip selects in the I/O address space on 256 word boundaries. 


XA0 is used to drive the single address line required in interfacing to the con- 
verter. This input selects between an internal 32-word FIFO buffer and the 
A/D’s control/status register. Thus, the FIFO is located at address 0804000h 
and the control/status register is located at address 0804001h. 
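
The address decode just described can be sketched in software. The Python fragment below assumes that the converter sits on decoder output 0 and that XA10-XA8 form the three decoder inputs with XA8 least significant; both points are assumptions made for illustration, as are the function names.

```python
# Model of the I/O address decode for the A/D converter interface.
# Assumptions: XA10-XA8 drive the decoder (XA8 least significant) and
# the converter uses decoder output 0. XA0 selects the register.

IO_BASE = 0x804000  # beginning of the I/O address space, per the text

def decoder_output(address: int) -> int:
    """Return which of the eight chip selects XA10-XA8 would activate."""
    return (address >> 8) & 0x7

def converter_register(address: int) -> str:
    """Return the converter register selected by XA0."""
    return "control/status" if address & 0x1 else "FIFO"

for addr in (IO_BASE, IO_BASE + 1, IO_BASE + 0x100):
    print(hex(addr), decoder_output(addr), converter_register(addr))
```

Each step of 256 words in the address moves to the next decoder output, which matches the 256 word chip select boundaries mentioned above.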

Since the converter requires RD and WR control signals rather than WE and 
OE, random logic is used to generate these signals from IOSTRB and XR/W. 
The converter’s IRQ (Interrupt Request) output is used to alert the 
TMS320C30 to various conditions of converter status. 


Figure 13-13 shows the timing for read and write operations between the 
TMS320C30 and the AD1332. Both operations are shown on the same tim- 
ing diagram since, unlike the primary bus, only data bus timing and the state 
of XR/W differ between the two different types of cycles. 


Figure 13-13. Timing of Expansion Bus Interface 


In both cases, address and XR/W are valid t1 = 10 ns after the falling edge of 
H1. After t2 = 17 ns, the propagation delay of the 74ALS138, the A/D con- 
verter’s chip select goes low, selecting the device. Then, t3 = 10 ns after the 
rising edge of H1, IOSTRB goes low, and t4 = 5.8 ns following this, the RD or 
WR signal to the converter goes low, initiating either a read or write cycle, re- 
spectively. 


For a read operation, the A/D converter provides data back to the TMS320C30 
t4 + t5 = 30.8 ns after RD goes low. This satisfies the TMS320C30’s re- 
quirement of having data valid 35 ns after IOSTRB. For write operations, the 
A/D converter requires less than 5 ns of data setup and hold time with respect 
to the rising edge of WR. This is met with a high degree of margin by the 
TMS320C30. 
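
The read margin follows directly from the quoted numbers. The Python sketch below treats the 30.8 ns total as directly comparable to the 35 ns requirement, as the text does; the constant and function names are illustrative.

```python
# Read-cycle margin for the A/D converter interface of Figure 13-13.
# All values are in nanoseconds and come from the surrounding text.

T4_NS = 5.8                  # IOSTRB low to RD/WR low
DATA_VALID_NS = 30.8         # t4 + t5 as quoted in the text
DATA_REQUIRED_NS = 35.0      # data must be valid within this time

def read_margin_ns(data_valid_ns: float, required_ns: float) -> float:
    """Return how much earlier than required the read data arrives."""
    return required_ns - data_valid_ns

t5_ns = DATA_VALID_NS - T4_NS  # converter access time implied by the sum
margin = read_margin_ns(DATA_VALID_NS, DATA_REQUIRED_NS)

print(f"t5 = {t5_ns:.1f} ns, read margin = {margin:.1f} ns")
```

A positive margin means the converter can be read without wait states on the I/O bus.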


Note that for the AD1332’s FIFO to be clocked properly, the RD signal must 
go high between accesses to the device. Although the AD1332 may be fast 
enough in some cases to be used at speeds approaching those of the primary 
bus, the STRB signal on the primary bus stays low for multiple consecutive 
read cycles. The I/O bus, therefore, is the preferable choice for interface to 
this device. 


13.4 System Control Functions 


Several aspects of TMS320C30 system hardware design are critical to overall 
system operation. These include such functions as clock and reset signal 
generation and interrupt control. 


13.4.1 Clock Oscillator Circuitry 


An input clock may be provided to the TMS320C30 either from an external 
clock input or by using the on-board oscillator. Unless special clock require- 
ments exist, using the on-board oscillator is generally a convenient method 
of clock generation. This method requires few external components and can 
provide stable, reliable clock generation for the device. 


Figure 13-14 shows a clock generator circuit using the internal oscillator. This 
circuit is designed to operate at 33.33 MHz and since crystals with funda- 
mental oscillation frequencies of 30 MHz and above are not readily available, 
a parallel-resonant third-overtone circuit is used. 


[Circuit diagram. Labeled elements: TMS320C30 X1 and X2/CLKIN pins; 33.33 MHz crystal; 47 pF; 0.1 µF; 20 pF.]


Figure 13-14. Crystal Oscillator Circuit 


In a third-overtone oscillator, the crystal fundamental frequency must be at- 
tenuated so that oscillation is at the third harmonic. This is achieved with an 
LC circuit that filters out the fundamental, thus allowing oscillation at the third 
harmonic. The impedance of the LC network must be inductive at the crystal 
fundamental and capacitive at the third harmonic. The impedance of the LC 
circuit is given by: 

$$ z(\omega) = \frac{L/C}{j\left[\omega L - \dfrac{1}{\omega C}\right]} \tag{3} $$

Therefore, the LC circuit has a pole at: 

$$ \omega_p = \frac{1}{\sqrt{LC}} \tag{4} $$

At frequencies significantly lower than ωp, the 1/(ωC) term in (3) becomes 
the dominating term, while ωL can be neglected. This gives: 

$$ z(\omega) = j\omega L \quad \text{for } \omega \ll \omega_p \tag{5} $$

In (5), the LC circuit appears inductive at frequencies lower than ωp. On the 
other hand, at frequencies much higher than ωp, the ωL term is the dominant 
term in (3), and 1/(ωC) can be neglected. This gives: 

$$ z(\omega) = \frac{1}{j\omega C} \quad \text{for } \omega \gg \omega_p \tag{6} $$

The LC circuit in (6) appears increasingly capacitive as frequency increases 
above ωp. This is shown in Figure 13-15, which is a plot of the magnitude 
of the impedance of the LC circuit of Figure 13-14 versus frequency. 

Figure 13-15. Magnitude of the Impedance of the Oscillator LC 
Network 

Based on the discussion above, the design of the LC circuit proceeds as fol- 
lows: 


1) Choose the pole frequency ωp approximately halfway between the crys- 
tal fundamental and the third harmonic. 


2) The circuit now appears inductive at the fundamental frequency and 
capacitive at the third harmonic. 


In the oscillator of Figure 13-14, choose ωp = 22.2 MHz, which is approxi- 
mately halfway between the fundamental and the third harmonic. Choose C 
= 20 pF. Then, using (4), L = 2.6 µH. 
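
The inductor value can be reproduced numerically. The Python sketch below rearranges (4) for L and assumes that the 22.2 MHz pole is a frequency in hertz that must be converted to angular frequency, which is the reading that yields the quoted inductance. The function name is illustrative.

```python
import math

def inductance_for_pole(f_pole_hz: float, c_farads: float) -> float:
    """Solve w_p = 1/sqrt(L*C) for L, with w_p = 2*pi*f_pole_hz."""
    omega_p = 2.0 * math.pi * f_pole_hz
    return 1.0 / (omega_p ** 2 * c_farads)

F_POLE_HZ = 22.2e6   # pole placed between 11.11 MHz and 33.33 MHz
C_FARADS = 20e-12    # chosen capacitor value

L_henries = inductance_for_pole(F_POLE_HZ, C_FARADS)
print(f"L = {L_henries * 1e6:.2f} uH")
```

The result rounds to the 2.6 µH used in the design.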


13.4.2 Reset Signal Generation 


The reset input controls initialization of internal TMS320C30 logic and also 
causes execution of the system initialization software. For proper system in- 
itialization, the reset signal must be applied for at least ten H1 cycles, i.e., 
600 ns for a TMS320C30 operating at 33.33 MHz. Upon powerup, however, it 
can take 20 ms or more before the system oscillator reaches a stable operating 
state. Therefore, the powerup reset circuit should generate a low pulse on the 
reset line for 100 to 200 ms. Once a proper reset pulse has been applied, the 
processor fetches the reset vector from location zero, which contains the ad- 
dress of the system initialization routine. Figure 13-16 shows a circuit which 
will generate an appropriate powerup reset signal. 


Figure 13-16. Reset Circuit 

The voltage on the reset pin (RESET) is controlled by the R1C1 network. After 
a reset, this voltage rises exponentially according to the time constant R1C1, 
as shown in Figure 13-17. 


Figure 13-17. Voltage on the TMS320C30 Reset Pin 

The duration of the low pulse on the reset pin is approximately t1, which is the 
time it takes for the capacitor C1 to be charged to 1.5 V. This is approximately 
the voltage at which the reset input switches from a logic 0 to a logic 1. The 
capacitor voltage is given by: 

$$ V = V_{CC}\left(1 - e^{-t/\tau}\right) \tag{7} $$

where τ = R1C1 is the reset circuit time constant. Solving (7) for t gives: 

$$ t = -R_1 C_1 \ln\left(1 - \frac{V}{V_{CC}}\right) \tag{8} $$

Setting the following: 

R1 = 1 MΩ 
C1 = 0.47 µF 
VCC = 5 V 
V = V1 = 1.5 V 

gives t = 167 ms. Therefore, the reset circuit of Figure 13-16 provides a low 
pulse of long enough duration to ensure the stabilization of the system oscil- 
lator upon powerup. 
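
The reset pulse duration can be checked by evaluating (8) directly. The Python sketch below uses the component values listed above; the function name is illustrative.

```python
import math

def reset_low_time(r_ohms: float, c_farads: float,
                   vcc: float, v_threshold: float) -> float:
    """Time for the RC node to charge from 0 V to v_threshold, per (8)."""
    return -r_ohms * c_farads * math.log(1.0 - v_threshold / vcc)

t_seconds = reset_low_time(r_ohms=1e6, c_farads=0.47e-6,
                           vcc=5.0, v_threshold=1.5)
print(f"reset low time = {t_seconds * 1e3:.0f} ms")
```

The value falls inside the 100 to 200 ms window recommended for the powerup reset pulse.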


Note that if synchronization of multiple TMS320C30s is required, all proces- 
sors should be provided with the same input clock and the same reset signal. 
After powerup, when the clock has stabilized, all processors may then be 
synchronized by generating a falling edge on the common reset signal. Since 
it is in fact the falling edge of reset that establishes synchronization, reset must 
be high for a period of time (at least ten H1 cycles) initially. Following the 
falling edge, reset should remain low for at least ten H1 cycles and then be 
driven high. This sequencing of reset may be accomplished using additional 
circuitry, based on either RC time delays or counters. 

13.5 XDS1000 Target Design Considerations 


The TMS320C30 Emulator is an Extended Development System (XDS1000), 
which has all the features necessary for full-speed emulation. The 
TMS320C30 uses a revolutionary technology to allow complete emulation via 
a serial scan path. If the user provides a 12 pin header on their target system, 
realtime emulation can be performed using the TMS320C30 device in their 
target system. Refer to Appendix B, Section B.1.4 for a complete description 
of the XDS1000. 


To use the emulation connector of the XDS1000, the signals shown in Figure 
13-18 should be provided to a 12 pin header (two rows of six pins) with pin 
8 cut out to provide keying. 


EMU1†          GND 
EMU0†          GND 
EMU2†          GND 
PD (+5 V)      NO PIN (KEY) 
EMU3           GND 
H3             GND 

TOP VIEW 

HEADER DIMENSIONS: 
PIN TO PIN SPACING   0.100 IN. (X,Y) 
PIN WIDTH            0.025 IN. SQUARE POST 
PIN LENGTH           0.235 IN. NOMINAL 


†These signals should always be pulled up with separate 20 kΩ resistors to +5 volts on the TMS320C30. 


Figure 13-18. 12 Pin Header Signals 


Signal Description: 

EMU0    Emulation pin 0. 
EMU1    Emulation pin 1. 
EMU2    Emulation pin 2. 
EMU3    Emulation pin 3. 
H3      TMS320C30 H3. 
PD      Presence detect. It indicates that the cable is connected 
        and the target system is powered up. It should be tied to 
        +5 volts in the target system. 

Figure 13-19 is a diagram of the typical setup when using the emulation 
connection of the XDS1000. 


Figure 13-19. Typical Setup For Using the Emulation Connection of the 
XDS1000 


For unbuffered signals, the distance between the TMS320C30 emulation pins 
(EMU0, EMU1, EMU2, EMU3, and H3) and the 12 pin header should be less 
than two inches. If the distance between the header and the TMS320C30 
emulation pins is more than two inches but less than six inches, the EMU3 
and H3 signals should be buffered. The buffer should be noninverting with a 
worst case propagation delay of 6.0 ns. For distances greater than six inches 
between the TMS320C30 emulation pins and the 12 pin header, all emulation 
signals should be buffered. Recall that EMU0, EMU1, and EMU2 are inputs 
and EMU3 and H3 are outputs. The buffer should have the same 
characteristics as given above. 
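
The buffering guideline can be written as a small lookup. The Python sketch below is an illustration only: the text does not say what to do at exactly two or six inches, so the sketch assumes the more conservative choice at each boundary, and the function name is hypothetical.

```python
# Which emulation signals to buffer for a given distance between the
# TMS320C30 emulation pins and the 12 pin header (distance in inches).
# Assumption: at exactly 2 or 6 inches the stricter rule is applied.

OUTPUTS = {"EMU3", "H3"}
ALL_SIGNALS = {"EMU0", "EMU1", "EMU2", "EMU3", "H3"}

def signals_to_buffer(distance_in: float) -> set:
    """Return the set of signals that should be buffered."""
    if distance_in < 0:
        raise ValueError("distance must be non-negative")
    if distance_in < 2.0:
        return set()            # short run: no buffering needed
    if distance_in < 6.0:
        return set(OUTPUTS)     # medium run: buffer the outputs
    return set(ALL_SIGNALS)     # long run: buffer every signal

for d in (1.0, 4.0, 8.0):
    print(d, sorted(signals_to_buffer(d)))
```

Any buffer chosen this way would still need to be noninverting with a worst case propagation delay of 6.0 ns, as stated above.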

Appendix A 
TMS320C30 Timing Specifications & Dimensions 


This section provides timing specifications and dimensions for the 
TMS320C30 (third-generation TMS320) processor. It is included to provide 
information in advance of the complete data sheet. Characterization data on 
the TMS320C30 is still being collected. A complete data sheet with additional 
information will be available in the future. Please contact the local TI field 
sales office to obtain this data sheet. 

Table A-1. Absolute Maximum Ratings Over Specified Temperature Range 


Condition/Characteristic 
-0.3 V to 7 V 
-0.3 V to 7 V 

Notes: 


1) Stresses beyond those listed under ‘Absolute Maximum Ratings’ may 
cause permanent damage to the device. This is a stress rating only and 
functional operation of the device at these or any other conditions be- 
yond those indicated in the ‘Recommended Operating Conditions’ sec- 
tion of this specification is not implied. Exposure to 
absolute-maximum-rated conditions for extended periods may affect 
device reliability. 


2) All voltage values are with respect to Vss. 


Table A-2. Recommended Operating Conditions 


        Operating Condition                  Min  Nom  Max  Unit 
VDD     Supply voltages (DDVDD, etc.) 
VSS     Supply voltages (CVSS, etc.) 
VIH     High-level input voltage 

Table A-3. Electrical Characteristics Over Specified Free-Air Temperature 
Range 


        Electrical Characteristic                                Min  Nom  Max  Unit 
VOH     High-level output voltage (VDD = Min, IOH = Max) 
VOL     Low-level output voltage (VDD = Min, IOL = Max) 
Ci      Input capacitance 


Notes: 


1) All typical values are at VDD = 5 V, TA = 25°C. 
2) f is the input clock frequency. The maximum value is 33.3 MHz. 


3) All input and output voltage levels are TTL compatible. 


[Test load circuit. Labeled elements: from output under test; test point; RL = 825 Ω; 2.15 V; CT = 100 pF.]


Figure A-1. Test Load Circuit 

Figure A-2. X2/CLKIN Timing 


Figure A-3. H1/H3 Timing 

Table A-4. Switching Characteristics for CLKIN, H1, and H3 


No.     Name         Description                      Min  Typ  Max  Unit 

(2)     tw(CIL)      CLKIN low pulse duration, 
                     tc(CI) = 30 ns 
(3)     tw(CIH)      CLKIN high pulse duration, 
                     tc(CI) = 30 ns 
(9.1)   td(HL-HH)    Delay from H1(H3) low to 
                     H3(H1) high 
(9.2)   td(HH-HL)    Delay from H1(H3) high to 
                     H3(H1) low 
(10)                 H1/H3 cycle time 


Note: P = tc(CI) 


Table A-5. Switching Characteristics for a Memory ((M)STRB = 0) Read 


No.    Name              Description                   Min  Typ  Max  Unit 

(11)   td(H1L-(M)SL)     H1 low to (M)STRB low         0         10   ns 
(12)   td(H1L-(M)SH)     H1 low to (M)STRB high        0         10   ns 
(13)   td(H1H-(X)RWL)    H1 high to (X)R/W low         0         10   ns 
(14)   td(H1L-(X)A)      H1 low to (X)A valid                         ns 
                                                       15             ns 
(15)   tsu((X)D)R        (X)D valid before H1 low 
                         (read) 
(17)   tsu((X)RDY)       (X)RDY valid before H1 
                         high 
(18)   th((X)RDY)        (X)RDY hold time after 
                         H1 high 

Figure A-4. Memory ((M)STRB = 0) Read 


Table A-6. Switching Characteristics for a Memory ((M)STRB = 0) Write 


No.    Name              Description                   Min  Typ  Max  Unit 

       td(H1-(X)RWH) 
(20)   tv((X)D)W         (X)D valid after H1 low 
                         (write) 
(21)   th((X)D)W         (X)D hold time after H1 
                         high (write) 

Figure A-5. Memory ((M)STRB = 0) Write 

Figure A-6. Memory (IOSTRB = 0) Read 

Table A-7. Switching Characteristics for a Memory (IOSTRB = 0) Read 


No.      Name            Description                   Min  Typ  Max  Unit 

(12.1)   td(H1H-IOSH)    H1 high to IOSTRB high        0         10   ns 
(15.1)   tsu(XD)R        XD valid before H1 high 
                         (read) 
(16.1)   th(XD)R         XD hold time after H1 
                         high (read) 
(17.1)   tsu(XRDY)       XRDY valid before H1 high 
(18.1)   th(XRDY)        XRDY hold time after H1 
                         high 


Table A-8. Switching Characteristics for a Memory (IOSTRB = 0) Write 


No.      Name            Description                   Min  Typ  Max  Unit 

                         H1 low to XR/W low            0         10   ns 
(20.1)   tv(XD)W         XD valid before H1 low        15 
                         (write) 
(21.1)   th(XD)W         XD hold time after H1 low 
                         (write) 

Figure A-7. Memory (IOSTRB = 0) Write 

Figure A-8. Timing for XF0 and XF1 When Executing a LDFI or LDII 


Table A-9. Information for Figure A-8 


Name            Description                     Min  Typ  Max  Unit 

td(H3H-XF0L)    H3 high to XF0 low 
tsu(XF1)        XF1 valid before H1 low 
th(XF1)         XF1 hold time after H1 low      0              ns 

Figure A-9. Timing for XF0 When Executing a STFI or STII 


Table A-10. Information for Figure A-9 


No.   Name            Description               Min  Typ  Max  Unit 

(1)   td(H3H-XF0H)    H3 high to XF0 high                 10   ns 

Figure A-10. Timing for XF0 and XF1 When Executing SIGI 


Table A-11. Information for Figure A-10 


No.   Name    Description                      Min  Typ  Max  Unit 

              XF0 valid before H1 low                         ns 
              XF0 hold time after H1 low       0              ns 

Figure A-11. Timing for Loading XF Register When Configured as an Output 
Pin 


Table A-12. Information for Figure A-11 


No.   Name          Description               Min  Typ  Max  Unit 

(1)   tv(H3H-XF)    H3 high to XF valid                      ns 

Figure A-12. Change of XF From Output to Input Mode 


Table A-13. Information for Figure A-12 

No.   Name       Description                      Min  Typ  Max  Unit 

                 H3 high to XF switching 
                 from output to input 
      tsu(XF)    XF setup before H1 low           10             ns 

Figure A-13. Change of XF From Input to Output Mode 


Table A-14. Information for Figure A-13 


No.   Name            Description                  Min  Typ  Max 

      td(H3H-XFIO)    H3 high to XF switching 
                      from input to output 


[Timing diagram. Signals shown: RESET (Note 5), (X)D (Note 1), (X)A (Note 2), control signals (Note 3), IACK, and asynchronously reset signals (Note 4).]


NOTES: 1. (X)D includes D(31-0) and XD(31-0). 
       2. (X)A includes A(23-0) and XA(12-0). 
       3. Control signals include R/W, STRB, XR/W, MSTRB, and IOSTRB. 
       4. Asynchronously reset signals include XF1, XF0, CLKX0, DX0, FSX0, CLKR0, DR0, FSR0, CLKX1, DX1, FSX1, CLKR1, DR1, 
          FSR1, TCLK0, and TCLK1. 
       5. RESET is an asynchronous input. 


Figure A-14. RESET Timing 

Table A-15. Information for Figure A-14 


No.    Name                   Description                         Min  Typ  Max 

(1)    tsu(RESET)             Setup for RESET before CLKIN low 
       td(CLKINH-H1H)         CLKIN high to H1 high 
(3)    tsu(RESETH-H1L)        Setup for RESET high before H1 low 
                              and after 10 H1 clock cycles 
       td(CLKINH-H3L)         CLKIN high to H3 low 
       tdis(H1H-XD)           H1 high to (X)D three state 
       tdis(H3H-XA)           H3 high to (X)A three state 
(10)   td(H3H-CONTROLH)       H3 high to control signals high 
       td(H1H-IACKH) 
       tdis(RESETL-ASYNCH)    RESET low to asynchronously reset 
                              signals three state 

[Timing diagram. Phases shown: reset or interrupt vector read; fetch first instruction of service routine. Signals shown: H3, H1, INT(3-0) pin, INT(3-0) flag.]


Figure A-15. RESET and INT(3-0) Response Timing 


Table A-16. Information for Figure A-15 


Name                Description                       Min  Typ    Max   Unit 

tw(INT) (Note 7)    Interrupt pulse width to               1.5P   <2P 
                    guarantee one interrupt seen 


Note 7: Interrupt pulse width must be at least 1 P wide to guarantee it will be seen. It must be less than 
2 P wide to guarantee it will be responded to only once. The recommended pulse width is 1.5 P. 
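
Note 7 amounts to a simple range check on the interrupt pulse width. The Python sketch below expresses it with P passed in as a parameter; the 30 ns value used in the example is a placeholder chosen for illustration, and the function names are hypothetical.

```python
# Range check for the external interrupt pulse width described in Note 7.
# The pulse must be at least 1 P wide to be seen and less than 2 P wide
# to be responded to only once. P is supplied by the caller.

def interrupt_pulse_ok(width_ns: float, p_ns: float) -> bool:
    """True when the pulse is seen and is responded to only once."""
    return p_ns <= width_ns < 2.0 * p_ns

def recommended_pulse_ns(p_ns: float) -> float:
    """Recommended pulse width of 1.5 P."""
    return 1.5 * p_ns

P_NS = 30.0  # example value only

print(recommended_pulse_ns(P_NS), interrupt_pulse_ok(45.0, P_NS))
print(interrupt_pulse_ok(20.0, P_NS), interrupt_pulse_ok(60.0, P_NS))
```

A pulse at the recommended width sits in the middle of the allowed window, which leaves equal tolerance on both sides.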

[Timing diagram. Phases shown: fetch IACK instruction; IACK data read. Signals shown: H1, IACK.]


Figure A-16. IACK Timing 


Table A-17. Information for Figure A-16 


No.   Name             Description                         Min  Typ  Max 

      td(H1H-IACKL)    H1 high to IACK low 
(2)   td(H1H-IACKH)    H1 high to IACK high during         10 
                       first cycle of IACK instruction 
                       data read 

[Timing diagram. Phases shown: fetch trap vector; read trap; fetch first instruction of trap routine. Address bus shows vector address, then first instruction address.]


Figure A-17. TRAP Response Timing 

[Timing diagram. Signals shown include DX, FSX(INT), and FSX(EXT).]


Figure A-18. Fixed Data Rate Mode 

[Timing diagram. Signals shown include FSR and DR.]


NOTES: 1. Timing diagrams show operation with CLKXP = CLKRP = FSXP = FSRP = 0. 
       2. Timings not expressly specified for variable data rate mode are the same as those for fixed data rate mode. 


Figure A-19. Variable Data Rate Mode 

Table A-18. Serial Port Timing as Shown in Figures A-18 and A-19 


CLKX/R cycle CLKX/R external t -(H)x2.6 
time. CLKX/R internal | t,(H)x234 t o(H)x2 
CLKX/R high/ CLKX/R external t.(H)+5 


low pulsewidth CLKX/R internal | tp(SCK)/2 [t .(SCK)/2]-15 


CLKXt0 DX | CLKX extemal | 35 
val CLK intemal | 20 
DR setup before | CLKR external Pt 


(9) tq( FSX) CLKX to internal | CLKX external 
FSX. CLKX internal 


tsu(FSR) | FSRsotup | CLKRextermal | 
(11) th( FS) FSX/R input CLKX/R external Pt 


hold from CLKX/R internal 
CLKX/R. 
External FSX CLKX external ¢(CLKX)/2]-10 [t.(H)-8] 


setup before CLKX internal te(CLKX) /2 -[t .(H)-21] 
CLKX. 


35 
20 
10 
25 
32 
17 
10 
10 
10 
(t 10 -[te(H)- 


CLKX external 


DX bit, FSX CLKX internal 21 
precedes CLKX. 


(8) th(DR) DR hold from CLKR external 

Figure A-20. HOLD/HOLDA Timing 


Table A-19. Information for Figure A-20 


No.   Name           Description                     Min   Max   Unit 

                     HOLD valid before H1 low        16          ns 
                     HOLDA valid after H1 low        0     10    ns 
(7)   td(H1L-SH)H    H1 low to STRB high for a       0     10    ns 
                     HOLD 
                     H1 low to STRB three state      0     10    ns 
                     H1 low to STRB active           0     10    ns 

Figure A-21. TMS320C30 180 Pin PGA Dimensions 


Appendix B 


Development Support/Part Order Information 


This section provides development support information, device part numbers, 
and support tool ordering information for the TMS320C30 (third-generation 
TMS320) processor. Figure B-1 shows the software and hardware develop- 
ment tools available and the development environment for the TMS320C30. 


Figure B-1. TMS320C30 Development Environment 

Extensive documentation, including data sheets, user’s guides, and application 
reports, is available to support DSP design. A series of DSP textbooks has 
been published both by Prentice-Hall and John Wiley and Sons to support 
research and education. Other support includes a technical support hotline 
(713-274-2320) and a bulletin board service (713-274-2323). TI’s Regional 
Technology Centers (RTCs) provide hands-on workshops and design ser- 
vices. 


Many third parties and consultants with DSP expertise can assist in various 
application areas. TMS320C30 Algorithm Development Packages will be 
available from multiple third parties and consultants in the near future. Sub- 
scribe to the DSP newsletter “Details on Signal Processing” for up-to-date 
information on new products and services from third parties and consultants. 
Call TI’s Customer Response Center at (800) 232-3200 to subscribe to the 
newsletter. Contact the nearest TI field sales office for support tool availability 
or further details (see list of sales offices and distributors at end of book). 


The major topics discussed in this section are listed below. 


•  TMS320C30 Development Support (Section B.1) 
   -  Macro Assembler/Linker 
   -  C Compiler 
   -  Simulator 
   -  Extended Development System (XDS1000) 
   -  TMS320 DSP Hotline/Bulletin Board Service 


•  TMS320C30 Part Order Information (Section B.2) 
   -  Device part numbers 
   -  Software and hardware support tools part numbers 
   -  Device and support tool prefix designators 
   -  Device nomenclature 

B.1 TMS320C30 Development Support 


Texas Instruments offers extensive development support and complete doc- 
umentation with the TMS320C30 (third-generation) digital signal processor. 
Tools are provided for the TMS320C30 to evaluate the performance of the 
processor, develop algorithm implementations, and fully integrate the design's 
software and hardware modules. Development operations are performed with 
the TMS320C30 Macro Assembler/Linker, C Compiler, Simulator, and Emu- 
lator (Extended Development System - XDS1000). 


A description of each TMS320C30 development support tool and its key
features is provided in the following subsections. For ordering information,
see Section B.2.


B.1.1 Macro Assembler/Linker 
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The TMS320C30 Macro Assembler/Linker is a software tool that converts 
source mnemonics to executable object code. 


The following key features distinguish the TMS320C30 Macro
Assembler/Linker:


• Macro capabilities and library functions
• Conditional assembly
• Relocatable modules
• Complete error diagnostics
• Symbol table and cross reference


The TMS320C30 Macro Assembler/Linker is shipped with four programs to 
address specific needs. They are: 


1) The assembler 

2) The archiver 

3) The linker 

4) The object format converter 


These programs and their functionality are described in the following para- 
graphs. 


• The assembler translates assembly language source files into machine
  language object files. Source files can contain instructions, assembler
  directives, and macro directives. Assembler directives can be used to
  control various aspects of the assembly process, such as the source
  listing format, data alignment, and section content.


• The archiver allows collection of a group of files into a single archive
  file. For example, several macros can be collected together into a macro
  library. The assembler will search through the library and use the
  members that are called as macros by the source file. It is also possible
  to use the archiver to collect a group of object files into an object
  library. The linker will include the members in the library that resolve
  external references during the link.


Appendix B - TMS320C30 Development Support 


• The linker combines object files into a single executable object module.
  As it creates the executable module, it performs relocation and resolves
  external references. The linker accepts relocatable object files (created
  by the assembler) as input. It also accepts archive library members and
  output modules created by a previous linker run. Linker directives allow
  combining of file sections, binding of sections or symbols to addresses,
  and defining of global symbols.

  The main purpose of this development process is to produce a module
  that can be executed in a system that contains a TMS320C30 device
  or the software or hardware development tools. (Note that only
  linked files can be executed.)


• Most EPROM programmers do not accept assembler/linker files as input.
  The object format converter converts the object file into Intel,
  Tektronix, or TI-tagged object format. The converted file can be
  downloaded to an EPROM programmer. This EPROM code can then be
  executed on the TMS320C30 device.
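
As a rough illustration of the kind of output the object format converter
produces, the sketch below encodes a block of bytes as Intel hex data
records. The record layout (colon, byte count, 16-bit address, record type,
data bytes, two's-complement checksum) is the standard Intel format. The
function names, the 16-byte record length, and the sample bytes are
assumptions made for illustration only; they do not describe the TI
converter's options or how it divides 32-bit TMS320C30 words among EPROMs.

```python
def intel_hex_record(address, data, record_type=0x00):
    """Return one Intel hex record (hypothetical helper, 16-bit addresses only)."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    if len(data) > 255:
        raise ValueError("a record holds at most 255 data bytes")
    # Byte count, address high byte, address low byte, record type, then data.
    body = bytes([len(data), (address >> 8) & 0xFF, address & 0xFF, record_type])
    body += bytes(data)
    # The checksum is the two's complement of the low byte of the sum.
    checksum = (-sum(body)) & 0xFF
    return ":" + (body + bytes([checksum])).hex().upper()


def to_intel_hex(data, start=0, per_record=16):
    """Split data into data records and append the end-of-file record."""
    lines = [
        intel_hex_record(start + offset, data[offset:offset + per_record])
        for offset in range(0, len(data), per_record)
    ]
    lines.append(intel_hex_record(0, b"", record_type=0x01))  # end-of-file record
    return lines


if __name__ == "__main__":
    # Sample bytes are arbitrary; a real run would read the linked object code.
    for line in to_intel_hex(bytes(range(4))):
        print(line)
```

Each line that results can be sent to an EPROM programmer that accepts
Intel hex input; the checksum lets the programmer detect transfer errors.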


Refer to Figure B-1 for a diagram of the development environment when using 
the Assembler/Linker. 


The macro assembler/linker is currently available for PC/MS-DOS, VAX VMS,
SUN-3 UNIX, and VAX ULTRIX operating systems.


B.1.2 C Compiler 


The optimizing C compiler is a full implementation of the standard Kernighan
and Ritchie C. The compiler accepts a digital signal processing program
written in the C language. It outputs TMS320C30 assembly language source
code, which is then processed by the assembler, where the TMS320C30
mnemonics are converted to object code.


This high-level language compiler allows time-critical routines written in as- 
sembly language to be called from within the C program. The converse is also 
available; assembly routines may call C functions. The output of the compiler 
can be edited prior to assembly/link to further optimize the performance of the 
code. The compiler supports the insertion of assembly language code into C 
source code. The result is a compiler that allows the relative amounts of 
high-level programming and assembly language code to be tailored according 
to the application. Refer back to Figure B-1 for a diagram of the development 
environment when using the C compiler. 


The compiler is currently available for PC/MS-DOS, VAX VMS, SUN-3 UNIX, 
and VAX ULTRIX operating systems. The assembler/linker is included with the 
shipment of the TMS320C30 C compiler. The output of this assembler/linker 
can be downloaded and used with the simulator, XDS, or PROM programmer. 
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B.1.3 Simulator 
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The TMS320C30 Simulator is a software program that simulates operation of 
the TMS320C30. 


The following features highlight simulator capability for effective TMS320C30 
software development: 


• Simulates the entire TMS320C30 digital signal processor instruction set

• Simulates the key TMS320C30 peripheral features (DMA, timers, and
  serial port)

• Command entry from either menu-driven keystrokes (menu mode) or
  from a batch file (line mode)

• Help menus for all screen modes

• Standard interface can be user customized

• Simulation parameters quickly stored/retrieved from files to facilitate
  preparation for individual sessions

• Reverse assembly allows editing and re-assembly of source statements

• Memory can be displayed (at the same time) as:
  - hexadecimal 32-bit values
  - assembled source

• Execution modes include:
  - single/multiple instruction count
  - single/multiple cycle count
  - until condition is met
  - while condition exists
  - for set loop count
  - unrestricted run with halt by key input

• Easy to define trace expressions

• Trace execution with display choices of:
  - designated expression values
  - cache registers
  - instruction pipeline for easy optimization of code

• Breakpoint conditions include:
  - address read
  - address write
  - address read or write
  - address execute
  - expression valid

• Simulates cache utilization

• Cycle counting
  - display the number of clock cycles in single step or run mode
  - external memory can be configured with wait states for accurate
    cycle counting

The simulator allows verification and monitoring of the state of the processor.
Simulation speed is on the order of thousands of instructions per second
(VAX/VMS, VAX/ULTRIX, and SUN-3 UNIX) or hundreds of instructions per
second (PC/MS-DOS).
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The simulators use TMS320C30 object code, produced by the Macro 
Assembler/Linker. Input and output files may be associated with the port 
addresses of the I/O instructions in order to simulate I/O devices connected
to the processor. Before initiating program execution, breakpoints may be set, 
and the trace format defined. 


During program execution, the internal registers and memory of the simulated 
TMS320C30 are modified as each instruction is interpreted by the host com- 
puter. Execution is suspended when one of the following conditions exists: 


1) A breakpoint or error is encountered. 
2) Execution is halted. 


Once program execution is suspended, the internal registers and both program 
and data memories can be inspected and/or modified. The trace memory can 
also be displayed. A record of the simulation session can be maintained in a 
journal file, so that it can be re-executed to regain the same machine state 
during another simulation session. 


The user interface in the simulator is identical to that in the XDS. See Figure 
B-2 for an example of the user interface. 




CODE WINDOW COMMAND LINE 
Shows source code (LO command loads code from <filename>) 


ENABLED PF KEYS REGISTER DISPLAY WINDOWS 


DISPLAY WINDOW 
Shows: Banner (DC command) 


Expressions (DE commands) 
Files (DF command) 
Memory (DM command) 
Symbols (DS command) 
(And other displays.) 


Figure B-2. TMS320C30 Simulator User Interface 


The simulator is currently available from TI for PC/MS-DOS, VAX VMS, and
VAX ULTRIX operating systems. A SUN-3 UNIX version of the simulator can
be purchased from a third party: Spectron Microsystems Inc. This version is
the same as TI’s simulator for the PC/MS-DOS, VAX VMS, and VAX ULTRIX.
Contact Spectron at (805) 967-0503 for more information.
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B.1.4 TMS320C30 Emulator - Extended Development System (XDS1000) 


The TMS320C30 Emulator (XDS1000) is a user-friendly system that has all
the features necessary for full-speed emulation to debug hardware, debug
software, or integrate the software with the hardware. Some of the XDS1000’s
features include:


• Full-speed execution and monitoring out of the customer's target system
  via a 12-pin target connector
• Software breakpoint
• Software trace
• Software timing capabilities
• Single-step execution
• Inspect/modify registers and program/data memory
• Upload/download capabilities to/from data/program memory
• Windowed user interface similar to the TMS320C30 simulator


Full-speed execution and monitoring of the customer's target system via a
12-pin target connector has the advantage of using a serial scan path to give
access to the internal registers as well as internal and external memory of the
device. Since execution is out of the TMS320C30 located in the target
system, there is no timing difference during emulation.


Software breakpoints mean that the program can be stopped at a specific
address. When the program counter reaches the designated breakpoint address,
the emulator will halt execution and allow the user to observe the status of the
TMS320C30 (i.e., inspect memory or registers). Software trace allows viewing
of the TMS320C30’s state when a breakpoint is reached. This information
can be saved to a file for future analysis. Software timing permits keeping
track of clock ticks between breakpoints or while single-stepping the program.


The XDS1000 consists of two full-size PC-XT/AT cards. One card is the 
TMS320C30 XDS1000 Controller Card, the other is the TMS320C30 
XDS1000 Development Board. 


The TMS320C30 XDS1000 Controller Card is responsible for interpreting 
commands sent from the PC and converting those commands into appropriate 
signal sequences to control the TMS320C30 in the user’s target system. 


The TMS320C30 XDS1000 Development Board is a predefined target system 
that contains: 


• A TMS320C30 device

• 16K x 32-bits full-speed (zero wait state) SRAM on the primary bus

• Two selectable banks of 8K x 32-bits full-speed (zero wait state) SRAM
  on the expansion bus


See Figure B-3 for a visual representation of the TMS320C30 XDS1000's
development environment.
[Figure: two configurations of the TMS320C30 XDS1000 Controller Card.
Hardware/Software Development Environment: the controller card connects
through the emulation connector to the TMS320C30 in the TMS320C30 User
Target System, which also holds user memory and I/O. Algorithm
Development Environment: the controller card connects to the TMS320C30
XDS1000 Development Board.]

Figure B-3. TMS320C30 XDS1000 Development Environment 


This figure shows the two environments in which the TMS320C30 XDS1000 
can operate: 


1) The hardware/software development environment configures the TMS320C30
   XDS1000 and the user’s target system in the emulator mode. Section
   13.5 of this document shows the 12-pin header or emulator connector
   necessary for the user's target system to work with the TMS320C30
   XDS1000.


2) The algorithm development environment allows the user to debug
   software before the user's target system is built. In this configuration,
   the TMS320C30 XDS1000 Development Board can be used in place of
   the user's target system. In this mode, code can be downloaded into the
   memory on the TMS320C30 XDS1000 Development Board and executed
   at full speed.


To use the TMS320C30 XDS1000, the following equipment is required: 


• IBM PC-XT/AT compatible

• Two and one-half eight-bit slots for the PC-AT, three full-size eight-bit
  slots for the PC-XT

• A minimum of 640K bytes of memory in the PC

• PC/MS-DOS rev 2.0 or later


In summary, the TMS320C30 XDS1000 is a full-speed emulator that comes
with a prebuilt target system for early design development. The TMS320C30
XDS1000 can help debug hardware in real time, debug software in real time,
and integrate the hardware and software.


B.1.5 TMS320 DSP Hotline/Bulletin Board Service 


The TMS320 group at Texas Instruments provides a DSP Hotline to answer 
TMS320 technical questions such as device problems, development tools, 
documentation, upgrades, and new TMS320 products. The hotline is open 
five days a week from 8:00 AM to 6:00 PM Central Time. The phone number 
is (713) 274-2320. For pricing and availability of TMS320 devices and
development tools, contact the nearest TI sales office. To order literature, call
the Customer Response Center (CRC) at (800) 232-3200.


The TMS320 DSP Bulletin Board Service is a telephone-line computer bulletin 
board that provides access to information pertaining to TMS320 devices. 
Specification updates for current or new TMS320 devices and development 
tools are communicated via the bulletin board as the information becomes 
available. The Bulletin Board Service can be accessed by dialing (713) 
274-2323 with a 300, 1200, or 2400-bps modem. 


The bulletin board contains TMS320C30 source code from Section 12 of the
TMS320C30 User's Guide as well as development tool and silicon revisions
and enhancements. The bulletin board also provides new DSP application
software as it becomes available. See the TMS320 Family Development
Support Reference Guide for further information on how to access the bulletin
board.
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B.2 TMS320C30 Part Order Information 


This section provides the device and support tool part numbers. Table B-1 
lists the part numbers for the TMS320C30, and Table B-2 gives ordering
information for TMS320C30 hardware and software support tools. A discussion
of the TMS320 family device and development support tool prefix designators
is included to assist in understanding the TMS320 product numbering system.


Table B-1. TMS320C30 Digital Signal Processor Part Numbers 


DEVICE           TECHNOLOGY     OPERATING    PACKAGE TYPE           TYPICAL
                                FREQUENCY                           DISSIPATION

†TMX320C30GBH    1.0-µm CMOS    33 MHz       Ceramic 180-pin PGA


† Military version planned; contact nearest sales office for availability.


Table B-2. TMS320C30 Support Tool Part Numbers 


TOOL DESCRIPTION                       OPERATING SYSTEM   PART NUMBER

SOFTWARE

Macro Assembler/Linker                 VAX VMS            TMDX3243250-08
                                       PC/MS-DOS          TMDX3243850-02
                                       SUN-3 UNIX *       TMDX3243550-08
                                       VAX ULTRIX         TMDX3243260-08

C Compiler & Macro Assembler/Linker    VAX VMS            TMDX3243255-08
                                       PC/MS-DOS          TMDX3243855-02
                                       SUN-3 UNIX *       TMDX3243555-08
                                       VAX ULTRIX         TMDX3243265-08

Simulator                              VAX VMS            TMDX3243251-08
                                       PC/MS-DOS          TMDX3243851-02
                                       SUN-3 UNIX         Offered by Spectron Inc.,
                                                          (805) 967-0503
                                       VAX ULTRIX         TMDX3243261-08

HARDWARE

XDS1000                                PC/MS-DOS          TMDX3261030


* Please note SUN UNIX support for TMS320C30 software tools is for the 68000 family
based SUN-3 series workstations. These tools are NOT SUPPORTED on the SUN-4
series machines that use the SPARC processor, or the SUN-386i series of workstations.
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B.2.1 Device and Development Support Tool Prefix Designators 


To assist the user in understanding the stages in the product development 
cycle, Texas Instruments assigns prefix designators in the part number no- 
menclature. A device prefix designator has three options: TMX, TMP, and 
TMS, and a development support tool prefix designator has two options: 
TMDX and TMDS. These prefixes are representative of the evolutionary stages 
of product development from engineering prototypes (TMX/TMDX) through 
fully qualified production devices (TMS/TMDS). This development flow is 
defined below. 


Device Development Evolutionary Flow: 


TMX   Experimental device that is not necessarily representative of the final
      device’s electrical specifications.


TMP   Final silicon die that conforms to the device’s electrical specifications
      but has not completed quality and reliability verification.


TMS   Fully qualified production device.


Support Tool Development Evolutionary Flow: 


TMDX Development support product that has not yet completed Texas In- 
struments internal qualification testing. 


TMDS Fully qualified development support product. 


TMX and TMP devices and TMDX development support tools are shipped 
with the following disclaimer: 


“Developmental product is intended for internal evaluation purposes.” 


Note: 


Texas Instruments recommends that prototype devices (TMX or TMP) not
be used in production systems since their expected end-use failure rate is
undefined but predicted to be greater than standard qualified production
devices.


TMS devices and TMDS development support tools have been fully charac- 
terized and the quality and reliability of the device has been fully demon- 
strated. Texas Instruments standard warranty applies. 
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B.2.2 Device Nomenclature 


In addition to the prefix, the device nomenclature includes a suffix that follows 
the device family name. This suffix indicates the package type (e.g., N, FN, 
or GB) and temperature range (e.g., L). Figure B-4 provides a legend for 
reading the complete device name for any TMS320 family member. 


TMX  320  C  30  GB  H


PREFIX
   TMX = experimental device
   TMP = prototype device
   TMS = qualified device
   SMJ = MIL-STD-883C

DEVICE FAMILY
   320 = TMS320 family

TECHNOLOGY
   C = CMOS
   E = CMOS EPROM
   No letter = NMOS

DEVICE
   1st-gen. DSP: 10, 15, 17
   2nd-gen. DSP: 20, 25
   3rd-gen. DSP: 30

PACKAGE TYPE
   N  = plastic DIP
   JD = ceramic DIP side-brazed
   FN = plastic leaded CC
   GB = ceramic PGA
   FJ = ceramic leaded CC
   FD = leadless ceramic CC

TEMPERATURE RANGE
   H = 0 to 50°C
   L = 0 to 70°C
   S = -55 to 100°C
   M = -55 to 125°C
   A = -40 to 85°C

Figure B-4. TMS320 Device Nomenclature 
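
The legend in Figure B-4 can be applied mechanically. The sketch below
splits a part number into the six fields of the legend and looks each one
up. The lookup tables are taken from the figure; the function name
`decode_part_number`, the regular expression, and the sample part numbers
in the usage note are assumptions made for illustration, and the sketch
accepts only the codes shown in the figure.

```python
import re

# Lookup tables copied from the Figure B-4 legend.
PREFIXES = {
    "TMX": "experimental device",
    "TMP": "prototype device",
    "TMS": "qualified device",
    "SMJ": "MIL-STD-883C",
}
TECHNOLOGIES = {"C": "CMOS", "E": "CMOS EPROM", "": "NMOS"}
GENERATIONS = {
    "10": "1st-gen. DSP", "15": "1st-gen. DSP", "17": "1st-gen. DSP",
    "20": "2nd-gen. DSP", "25": "2nd-gen. DSP",
    "30": "3rd-gen. DSP",
}
PACKAGES = {
    "N": "plastic DIP",
    "JD": "ceramic DIP side-brazed",
    "FN": "plastic leaded CC",
    "GB": "ceramic PGA",
    "FJ": "ceramic leaded CC",
    "FD": "leadless ceramic CC",
}
TEMPERATURES = {
    "H": "0 to 50°C",
    "L": "0 to 70°C",
    "S": "-55 to 100°C",
    "M": "-55 to 125°C",
    "A": "-40 to 85°C",
}

# Hypothetical pattern: prefix, family, optional technology letter,
# two-digit device, package code, temperature letter.
_PART = re.compile(
    r"^(?P<prefix>TMX|TMP|TMS|SMJ)"
    r"(?P<family>320)"
    r"(?P<technology>[CE]?)"
    r"(?P<device>10|15|17|20|25|30)"
    r"(?P<package>JD|FN|GB|FJ|FD|N)"
    r"(?P<temperature>[HLSMA])$"
)


def decode_part_number(part):
    """Return a dict describing each field of a TMS320 part number."""
    match = _PART.match(part.strip().upper())
    if match is None:
        raise ValueError(f"not a recognized TMS320 part number: {part!r}")
    fields = match.groupdict()
    return {
        "prefix": PREFIXES[fields["prefix"]],
        "family": "TMS320 family",
        "technology": TECHNOLOGIES[fields["technology"]],
        "device": f"{fields['device']} ({GENERATIONS[fields['device']]})",
        "package": PACKAGES[fields["package"]],
        "temperature": TEMPERATURES[fields["temperature"]],
    }
```

For example, `decode_part_number("TMX320C30GBH")` reports an experimental
CMOS third-generation device in a ceramic PGA package rated for 0 to 50°C,
which matches the entry in Table B-1.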
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Instruction Opcodes 


The opcode fields for all the TMS320C30 instructions are shown in Table C-1. 
Bits in the table marked with a hyphen are defined in the individual instruction 
description (see Section 11). Table C-1, along with the instruction
descriptions, fully defines the instruction words. The opcodes are listed in
numerical order.


Table C-1. TMS320C30 Instruction Opcodes


[Table C-1 lists each instruction against opcode bits 31 through 23. The
bit values are not legible in this copy. Instruction names that remain
legible, in table order: POPF, PUSH, PUSHF, ROLC, RORC, RPTS, STF, STFI,
STI, STII, SIGI, SUBB, SUBC, SUBF, SUBI, SUBRB, SUBRF, SUBRI, TSTB, XOR,
IACK, CALLcond, TRAPcond, RETScond, MPYF3||ADDF3, MPYF3||SUBF3,
MPYI3||ADDI3, MPYI3||SUBI3, and OR3||STI. The final row is marked
"Reserved for reset, traps, and interrupts."]


† Opcode same for standard and delayed instructions.
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The quality and reliability performance of Texas Instruments Microprocessor 
and Microcontroller Products, which includes the three generations of 
TMS320 digital signal processors, relies on feedback from: 


• Our customers

• Our total manufacturing operation from front-end wafer fabrication to
  final shipping inspection

• Product quality and reliability monitoring.


Our customers' perception of quality must be the governing criterion for
judging performance. This concept is the basis for the Texas Instruments
Corporate Quality Policy, which is as follows:


“For every product or service we offer, we shall define the require- 
ments that solve the customer's problems, and we shall conform to 
those requirements without exception.” 


Texas Instruments offers a leadership reliability qualification system, based on
years of experience with leading-edge memory technology as well as years of
research into customer requirements. Quality and reliability programs at TI are
therefore based on customer input and internal information to achieve
constant improvement in quality and reliability.


D.1 Reliability Stress Tests 



Accelerated stress tests are performed on new semiconductor products and 
process changes to ensure product reliability excellence. The typical test en- 
vironments used to qualify new products or major changes in processing are: 


• High-temperature operating life
• Storage life
• Temperature cycling
• Biased humidity
• Autoclave
• Electrostatic discharge
• Package integrity
• Electromigration
• Channel-hot electrons (performed on geometries less than 2.0 µm).


Typical events or changes that require internal requalification of product in- 
clude: 


• New die design, shrink, or layout

• Wafer process (baseline/control systems, flow, mask, chemicals, gases,
  dopants, passivation, or metal systems)

• Packaging assembly (baseline control systems or critical assembly
  equipment)

• Piece parts (such as lead frame, mold compound, mount material, bond
  wire, or lead finish)

• Manufacturing site.


TI reliability control systems extend beyond qualification. Total reliability
controls and management include product reliability monitoring as well as final
product release controls. MOS memories, utilizing high-density active
elements, serve as the leading indicator in wafer-process integrity at TI MOS
fabrication sites, enhancing all MOS logic device yields and reliability. TI places
several thousand MOS devices per month on reliability test to ensure and
sustain built-in product excellence.


Table D-1 lists the microprocessor and microcontroller reliability tests, the 
duration of the test, and sample size. The following defines and describes 
those tests in the table. 


AOQ (Average Outgoing Quality)
    Amount of defective product in a population, usually expressed in
    terms of parts per million (PPM).

FIT (Failure In Time)
    Estimated field failure rate in number of failures per billion power-on
    device hours; 1000 FITS equals 0.1 percent fail per 1000 device hours.

Operating lifetest
    Device dynamically exercised at a high ambient temperature (usually
    125°C) to simulate field usage that would expose the device to a much
    lower ambient temperature (such as 55°C). Using a derived high
    temperature, a 55°C ambient failure rate can be calculated.

High-temperature storage
    Device exposed to 150°C unbiased condition. Bond integrity is stressed
    in this environment.

Biased humidity
    Moisture and bias used to accelerate corrosion-type failures in plastic
    packages. Conditions include 85°C ambient temperature with 85-percent
    relative humidity (RH). Typical bias voltage is +5 V and ground on
    alternating pins.

Autoclave (pressure cooker)
    Plastic-packaged devices exposed to moisture at 121°C using a pressure
    of one atmosphere above normal pressure. The pressure forces moisture
    permeation of the package and accelerates corrosion mechanisms (if
    present) on the device. External package contaminants can also be
    activated and caused to generate inter-pin current leakage paths.

Temperature cycle
    Device exposed to severe temperature extremes in an alternating fashion
    (-65°C for 15 minutes and 150°C for 15 minutes per cycle) for at least
    1000 cycles. Package strength, bond quality, and consistency of assembly
    process are stressed in this environment.

Thermal shock
    Test similar to the temperature cycle test, but involving a
    liquid-to-liquid transfer, per MIL-STD-883C, Method 1011.

PIND
    Particle Impact Noise Detection test. A non-destructive test to detect
    loose particles inside a device cavity.

Mechanical Sequence:
    Fine and gross leak            Per MIL-STD-883C, Method 1014.5
    Mechanical shock               Per MIL-STD-883C, Method 2002.3,
                                   1500 g, 0.5 ms, Condition B
    PIND (optional)                Per MIL-STD-883C, Method 2020.4
    Vibration, variable frequency  Per MIL-STD-883C, Method 2007.1,
                                   20 g, Condition A
    Constant acceleration          Per MIL-STD-883C, Method 2001.2,
                                   20 kg, Condition D, Y1 Plane min
    Fine and gross leak            Per MIL-STD-883C, Method 1014.5
    Electrical test                To data sheet limits

Thermal Sequence:
    Fine and gross leak            Per MIL-STD-883C, Method 1014.5
    Solder heat (optional)         Per MIL-STD-750C, Method 1014.5
    Temperature cycle              Per MIL-STD-883C, Method 1010.5,
    (10 cycles minimum)            -65 to +150°C, Condition C
    Thermal shock                  Per MIL-STD-883C, Method 1011.4,
    (10 cycles minimum)            -55 to +125°C, Condition B
    Moisture resistance            Per MIL-STD-883C, Method 1004.4
    Fine and gross leak            Per MIL-STD-883C, Method 1014.5
    Electrical test                To data sheet limits

Thermal/Mechanical Sequence:
    Fine and gross leak            Per MIL-STD-883C, Method 1014.5
    Temperature cycle              Per MIL-STD-883C, Method 1010.5,
    (10 cycles minimum)            -65 to +150°C, Condition C
    Constant acceleration          Per MIL-STD-883C, Method 2001.2,
                                   30 kg, Y1 Plane
    Fine and gross leak            Per MIL-STD-883C, Method 1014.5
    Electrical test                To data sheet limits

Electrostatic discharge
    Per MIL-STD-883C, Method 3015

Solderability
    Per MIL-STD-883C, Method 2003.3

Solder heat
    Per MIL-STD-750C, Method 2031, 10 sec

Salt atmosphere
    Per MIL-STD-883C, Method 1009.4, Condition A, 24 hrs min

Lead pull
    Per MIL-STD-883C, Method 2004.4, Condition A

Lead integrity
    Per MIL-STD-883C, Method 2004.4, Condition B1

Electromigration
    Accelerated stress testing of conductor patterns to ensure acceptable
    lifetime of power-on operation

Resistance to solvents
    Per MIL-STD-883C, Method 2015.4
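
The FIT definition above, and the idea of deriving a 55°C failure rate
from a 125°C operating life test, can be checked with a little arithmetic.
The sketch below is illustrative only. The FIT conversions follow directly
from the definition given above. The acceleration factor uses the Arrhenius
model, which is a common choice for temperature acceleration but is an
assumption here, since this appendix does not state which model TI applies;
the function names and the 0.7 eV activation energy in the usage note are
likewise hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV per kelvin


def fit_to_percent_per_1000_hours(fit):
    """Convert FIT (failures per 1e9 device hours) to percent per 1000 hours."""
    failures_per_device_hour = fit / 1e9
    return failures_per_device_hour * 1000 * 100


def expected_failures(fit, devices, hours):
    """Expected number of failures for a population run for a given time."""
    return fit * devices * hours / 1e9


def arrhenius_acceleration(activation_energy_ev, use_temp_c, stress_temp_c):
    """Acceleration factor between a stress and a use temperature.

    Assumes the Arrhenius model; the activation energy depends on the
    failure mechanism and is not given in this appendix.
    """
    use_k = use_temp_c + 273.15
    stress_k = stress_temp_c + 273.15
    exponent = activation_energy_ev / BOLTZMANN_EV_PER_K * (1 / use_k - 1 / stress_k)
    return math.exp(exponent)


if __name__ == "__main__":
    print(fit_to_percent_per_1000_hours(1000))
    print(expected_failures(1000, devices=1000, hours=1000))
    print(arrhenius_acceleration(0.7, use_temp_c=55, stress_temp_c=125))
```

With 1000 FITs the first function gives approximately 0.1 percent per 1000
device hours, in agreement with the definition. Dividing a failure rate
measured at 125°C by the acceleration factor gives an estimate of the rate
at 55°C under the assumed model.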
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Table D-1. Microprocessor and Microcontroller Tests 


TEST                                  DURATION      SAMPLE SIZE
                                                    PLASTIC   CERAMIC

Operating life, 125°C, 5.0 V          1000 hrs
Operating life, 150°C, 5.0 V          1000 hrs
Storage life, 150°C                   1000 hrs
Biased 85°C/85 percent RH, 5.0 V      1000 hrs
Autoclave, 121°C, 1 ATM               240 hrs
Temperature cycle, -65 to 150°C       1000 cyc†
Temperature cycle, 0 to 125°C         3000 cyc
Thermal shock, -65 to 150°C           200 cyc
Electrostatic discharge, +2 kV
Latch-up (CMOS devices only)
Mechanical sequence
Thermal sequence
Thermal/mechanical sequence
PIND
Internal water vapor
Solderability
Solder heat
Resistance to solvents
Lead integrity
Lead pull
Lead finish adhesion
Salt atmosphere
Flammability (UL94-V0)
Thermal impedance


* If junction temperature does not exceed plasticity of package.
† For severe environments; reduced cycles for office environments.


Table D-2 lists the TMS320C30 device, the approximate number of transis- 
tors, and the equivalent gates. The numbers have been determined from design 
verification runs. 


Table D-2. TMS320C30 Transistors 


DEVICE # TRANSISTORS # GATES 
CMOS: TMS320C30 600K-700K 200K 


TI Qualification test updates are available upon request at no charge. TI will 
consider performing any additional reliability test(s), if requested. For more 
information on TI quality and reliability programs, contact the nearest TI field 
sales office. 
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Note: 


Texas Instruments reserves the right to make changes in MOS Semicon- 
ductor test limits, procedures, or processing without notice. Unless prior 
arrangements for notification have been made, TI advises all customers to 
reverify current test and manufacturing conditions prior to relying on 
published data. 
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Schweber (619) 450-0454; Wyle (619) 565-9171; 

San Francisco Bay Area: Arrow/Kierulff (408) 745-6600,
-0900; Marshall (408) 942-4600;
Schweber (408) 432-7171; Wyle (408) 727-2500;
Zeus (408) 998-5121.


COLORADO: Arrow/Kierulff (303) 790-4444; 
Hall-Mark (303) 790-1662; Marshall (303) 451-8383; 
Schweber (303) 799-0258; Wyle (303) 457-9953. 


CONNECTICUT: Arrow/Kierulff (203) 265-7741;
Hall-Mark (203) 269-0100; Marshall (203) 265-3822;
Schweber (203) 748-7080.


FLORIDA: Ft. Lauderdale:
Arrow/Kierulff (305) 429-8200; Hall-Mark (305) 971-9280;
Marshall (305) 977-4880; Schweber (305) 977-7511;
Orlando: Arrow/Kierulff (305) 725-1480, (305) 682-6923;
Hall-Mark (305) 855-4020; Marshall (305) 767-8585;
Schweber (305) 331-7555; seed (305) 365-3000;
Tampa: Hall-Mark (813) 53
Marshall (813) 576-1399.


GEORGIA: Arrow/Kierulff (404) 449-8252;
Hall-Mark (404) 447-8000; Marshall (404) 923-5750;
Schweber (404) 449-9170.


ILLINOIS: Arrow/Kierulff (312) 250-0500;
Hall-Mark (312) 860-3800; Marshall (312) 490-0155;
Newark (312) 784-5100; Schweber (312) 364-3750.


INDIANA: Indianapolis: Arrow/Kierulff (317) 243-9353;
Hall-Mark (317) 872-8875; Marshall (317) 297-0483.


IOWA: Arrow/Kierulff (319) 395-7230;
Schweber (319) 373-1417.


KANSAS: Kansas City: Arrow/Kierulff (913) 541-9542;
Hall-Mark (913) poole Hela (913) 492-3121;
Schweber (913) 492-2922.


MARYLAND: Arrow/Kierulff (301) 995-6002;
Hall-Mark (301) 988-9800; Marshall (301) 840-9450;
Schweber (301) 840-5900; Zeus (201) 997-1118.
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MASSACHUSETTS Arrow/Kierulff (617) 935-5134; 
Hall-Mark (617) 667-0902; Marshall (617) 658-0810;
Schweber (617) 275-5100, (617) 657-0
Time (617) 532-6200; Zeus (617) 863-3800.


MICHIGAN: Detroit: Arrow/Kierulff (313) An
Marshall (313) rane Newark (313) 967-0600
Schweber (313) 525-8100;
Grand Rapids: Arrow/Kierulff (616) 243-0912.


MINNESOTA: Arrow/Kierulff (612) 830-1800; 
Hall-Mark (612) con Marshall (612) 559-2211; 
Schweber (612) 941-5280. 


MISSOURI: St. Louis: Arrow/Kierulff (314) 567-6888;
Hall-Mark (314) 291-5350; Marshall (314) 291-4650;
Schweber (314) 739-0526.


NEW HAMPSHIRE: Arrow/Kierulff (603) 668-6968; 
Schweber (603) 625-2250. 


NEW JERSEY: Arrow/Kierulff (201) 538-0900,
(609) 596-8000; GRS Electronics (609) 964-8560;
Hall-Mark (201) 575-4415, (609) 235-1900;
Marshall (201) 882-0320, (609) 234-9100;
Schweber (201) 227-7880.


NEW MEXICO: Arrow/Kierulff (505) 243-4566.


NEW YORK: Long Island: 

Arrow/Kierulff (516) 231-1000; Hall-Mark (516) 737-0600; 
Marshall (516) 273-2424; Schweber (516) 334-7555; 
Zeus (914) 937-7400; 

Rochester: Arrow/Kierulff (716) 427-0300; 

Hall-Mark (716) 244-9290; Marshall (716) 235-7620; 
Schweber (716) 424-2222; 

Syracuse: Marshall (607) 798-1611. 


NORTH CAROLINA: Arrow/Kierulff (919) 876-3132,
(919) 725-8711; Hall-Mark (919) 872-07
Marshall (919) 878-9882; Schweber (919) 876-0000.


OHIO: Cleveland: Arrow/Kierulff (216) 248-3990;
Hall-Mark (216) 349-4632; Marshall (216) 248-1788;
Schweber (216) 464-29 2970;
Columbus: Arrow/Kierulff (614) 436-0928;
Hall-Mark (614) 688-33
Dayton: Arrow/Kierulff (613) 495-5
Marshall (513) 898-4480; Schweber (513) 439-1800.


OKLAHOMA: ayo tg (918) 252-7537; 
Schweber (918) 622-800: 


OREGON: pases 503) 645-6456; 
Marshall (503) 644-5050; Wyle (503) 640-6000. 


PENNSYLVANIA: Arrow/Kierulff (412) 856-7000, 
eS 928-1800; GRS Electronics Aah 922-7037; 
Schweber (215) 441-0600, (412) 963-6804.


TEXAS: Austin: Arrow/Kierulff (512) 835-4180;
Hall-Mark (512) 258-8848; Marshall (512) 837- ty
Schweber (512) 339-0088; Wyle (512) 834-995
Dallas: Arrow/Kierulff (214) 380-6464;
Hall-Mark (214) 553-4300; Marshall (214) 233-5200;
Schweber (214) 661-5010; Wyle (214) 235-9953;
Zeus (214) 783-7010;
Houston: Arrow/Kierulff (713) 530-4700;
Hall-Mark (713) 781-6100; Marshall (713) 895-9200;
Schweber (713) 784-3600; Wyle (713) 879-9953.


UTAH: Arrow/Kierulff (801) 973-6913; 
Hall-Mark (801) 972-1008; Marshall (801) 485-1551; 
Wyle (801) 974-9953. 


WASHINGTON: Arrow/Kierulff (206) 575-4420; 
Marshall (206) 747-9100; Wyle (206) 453-8300. 


WISCONSIN: Arrow/Kierulff (414) 792-0150; 
Hall-Mark (414) 797-7844; Marshall (414) 797-8400; 
Schweber (414) 784-9020. 


CANADA: Calgary: Future (403) 235-5325;
Edmonton: Future (403) 438-2858; 
Montreal: Arrow Canada (514) 735-5511; 
Future (514) 694-7710; 

Ottawa: Arrow Canada (613) 226-6903; 
Future (613) 820-8313; 

Quebec City: Arrow Canada (418) 687-4231; 
Toronto: Arrow Canada (416) 672-7769; 
Future (416) 638-4771; 

Vancouver: Future (604) 294-1166; 
Winnipeg: Future (204) 339-0554. 


Customer 
Response Center 


TOLL FREE: (800) 232-3200 


OUTSIDE USA: (214) 995-6611
(8:00 a.m. — 5:00 p.m. CST) 
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